Assignment 2¶

Contents:

  • Setup
    • Config
    • Modules
  • Comparison of an MLP and a CNN on the SVHN dataset
    • Training the MLP
      • Optimization of hyperparameters
      • Training
      • Evaluation
    • Training the CNN
      • Optimization of hyperparameters
      • Training
      • Evaluation
    • Discussion
  • Visualization of convolutional kernels (weights) and activations (feature maps)
    • Weights
    • Feature maps
    • Discussion
  • Comparison of regularization methods
    • No regularization
    • $\mathcal{L}_2$ regularization
    • $\mathcal{L}_1$ regularization
    • Discussion
  • Evaluation of custom learning rate warmup and learning rate scheduler
    • Training
    • Evaluation
    • Discussion
  • Extra Point
    • Train and evaluate a shallow MLP-Mixer model
    • Compare it with the best MLP model from before. Does it work better? Why or why not?

Setup¶

Config¶

In [ ]:
import assignment.config as config
Config loaded from /home/user/karacora/lab-vision-systems-assignments/assignment_2/assignment/config.yaml
In [ ]:
config.list_available()
Out[ ]:
['svhn_cnn', 'svhn_cnn_l1', 'svhn_cnn_l2', 'svhn_mlp']

Modules¶

In [ ]:
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import optuna
import torch
import torchsummary
import torchvision.models.feature_extraction as feature_extraction

import assignment.scripts.init_exp as init_exp
from assignment.evaluation.evaluator import Evaluator
from assignment.optimization_hyperparams.optimizer_hyperparams import OptimizerHyperparams
from assignment.training.trainer import Trainer
import assignment.libs.utils_checkpoints as utils_checkpoints
import assignment.libs.utils_data as utils_data
import assignment.libs.utils_model as utils_model
import assignment.libs.utils_optuna as utils_optuna
import assignment.visualization.plot as plot
import assignment.visualization.visualize as visualize

Comparison of an MLP and a CNN on the SVHN dataset¶

Training the MLP¶

In [ ]:
path_dir_exp_mlp = Path(config._PATH_DIR_EXPS) / "svhn_mlp"

init_exp.init_exp(name_exp="svhn_mlp", name_config="svhn_mlp")
config.set_config_exp(path_dir_exp_mlp)
Initializing experiment svhn_mlp...
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_mlp
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_mlp/checkpoints
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_mlp/logs
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_mlp/plots
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_mlp/visualizations
Config loaded from /home/user/karacora/lab-vision-systems-assignments/assignment_2/assignment/configs/svhn_mlp.yaml
Config saved to /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_mlp/config.yaml
Initializing experiment svhn_mlp finished
Config loaded from /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_mlp/config.yaml

Optimization of hyperparameters¶

In [ ]:
optimizer_hyperparams_mlp = OptimizerHyperparams(name_exp="svhn_mlp")
optimizer_hyperparams_mlp.create_study(load_if_exists=True)
optimizer_hyperparams_mlp.optimize(num_epochs=10, num_trials=50)
Creating study...
[I 2024-05-15 18:39:51,893] A new study created in RDB with name: study
Creating study finished
Optimizing...
    Trials    : 50
    Epochs    : 10
[I 2024-05-15 18:40:21,198] Trial 0 finished with value: 0.7606470106470107 and parameters: {'Learning rate': 0.00023775377441803875, 'Usage of bias': True, 'Dropout probability': 0.030895252767945934, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'ReLU', 'Hidden dimensions': '[2048]'}. Best is trial 0 with value: 0.7606470106470107.
[I 2024-05-15 18:40:50,220] Trial 1 finished with value: 0.7785967785967786 and parameters: {'Learning rate': 0.004734528845954143, 'Usage of bias': False, 'Dropout probability': 0.08701349395307593, 'Normalization layer': 'LayerNorm', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[1024, 256]'}. Best is trial 1 with value: 0.7785967785967786.
[I 2024-05-15 18:41:19,009] Trial 2 finished with value: 0.7548457548457549 and parameters: {'Learning rate': 0.00021933822745783503, 'Usage of bias': False, 'Dropout probability': 0.1997398041388372, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'ReLU', 'Hidden dimensions': '[512, 128, 32]'}. Best is trial 1 with value: 0.7785967785967786.
[I 2024-05-15 18:41:47,877] Trial 3 finished with value: 0.23048048048048048 and parameters: {'Learning rate': 0.00015962830873928016, 'Usage of bias': True, 'Dropout probability': 0.09391289995038264, 'Normalization layer': None, 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[]'}. Best is trial 1 with value: 0.7785967785967786.
[I 2024-05-15 18:42:16,928] Trial 4 finished with value: 0.741946491946492 and parameters: {'Learning rate': 0.0003363096125279802, 'Usage of bias': False, 'Dropout probability': 0.24615396397674855, 'Normalization layer': 'LayerNorm', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[1024, 256]'}. Best is trial 1 with value: 0.7785967785967786.
[I 2024-05-15 18:42:45,919] Trial 5 finished with value: 0.7594867594867595 and parameters: {'Learning rate': 0.00032270637031916696, 'Usage of bias': True, 'Dropout probability': 0.00746337105095356, 'Normalization layer': 'LayerNorm', 'Activation layer': 'ReLU', 'Hidden dimensions': '[512, 128]'}. Best is trial 1 with value: 0.7785967785967786.
[I 2024-05-15 18:43:14,814] Trial 6 finished with value: 0.7570980070980071 and parameters: {'Learning rate': 1.6531884448980347e-05, 'Usage of bias': False, 'Dropout probability': 0.10410958879480603, 'Normalization layer': 'LayerNorm', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[512, 128]'}. Best is trial 1 with value: 0.7785967785967786.
[I 2024-05-15 18:43:43,887] Trial 7 finished with value: 0.7742287742287742 and parameters: {'Learning rate': 0.006932820557210159, 'Usage of bias': False, 'Dropout probability': 0.2337152328433562, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[2048, 512, 128]'}. Best is trial 1 with value: 0.7785967785967786.
[I 2024-05-15 18:44:13,190] Trial 8 finished with value: 0.734029484029484 and parameters: {'Learning rate': 1.240350057307464e-05, 'Usage of bias': False, 'Dropout probability': 0.13030309463422815, 'Normalization layer': 'LayerNorm', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[1024, 256, 64]'}. Best is trial 1 with value: 0.7785967785967786.
[I 2024-05-15 18:44:42,194] Trial 9 finished with value: 0.75 and parameters: {'Learning rate': 1.4836170617604084e-05, 'Usage of bias': True, 'Dropout probability': 0.24827073925739368, 'Normalization layer': 'LayerNorm', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[1024, 256]'}. Best is trial 1 with value: 0.7785967785967786.
[I 2024-05-15 18:45:11,301] Trial 10 finished with value: 0.424993174993175 and parameters: {'Learning rate': 0.009305251845255588, 'Usage of bias': False, 'Dropout probability': 0.056924628378950695, 'Normalization layer': None, 'Activation layer': 'ReLU', 'Hidden dimensions': '[128, 32]'}. Best is trial 1 with value: 0.7785967785967786.
[I 2024-05-15 18:45:40,366] Trial 11 finished with value: 0.7622850122850123 and parameters: {'Learning rate': 0.009266047994526634, 'Usage of bias': False, 'Dropout probability': 0.2982808537493725, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[2048, 512, 128]'}. Best is trial 1 with value: 0.7785967785967786.
[I 2024-05-15 18:46:09,723] Trial 12 finished with value: 0.766994266994267 and parameters: {'Learning rate': 0.0022676715584426382, 'Usage of bias': False, 'Dropout probability': 0.18331463883846175, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[256]'}. Best is trial 1 with value: 0.7785967785967786.
[I 2024-05-15 18:46:39,057] Trial 13 finished with value: 0.8008463008463008 and parameters: {'Learning rate': 0.0021515943851436884, 'Usage of bias': False, 'Dropout probability': 0.17078937417317805, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[2048, 512, 128]'}. Best is trial 13 with value: 0.8008463008463008.
[I 2024-05-15 18:47:08,127] Trial 14 finished with value: 0.624010374010374 and parameters: {'Learning rate': 0.0017177108410743227, 'Usage of bias': False, 'Dropout probability': 0.15928893405272299, 'Normalization layer': None, 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[512]'}. Best is trial 13 with value: 0.8008463008463008.
[I 2024-05-15 18:47:36,886] Trial 15 finished with value: 0.6485121485121486 and parameters: {'Learning rate': 0.0018673819118427168, 'Usage of bias': False, 'Dropout probability': 0.06728413032091309, 'Normalization layer': 'LayerNorm', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[64]'}. Best is trial 13 with value: 0.8008463008463008.
[I 2024-05-15 18:48:06,252] Trial 16 finished with value: 0.8218673218673219 and parameters: {'Learning rate': 0.0006592586229589168, 'Usage of bias': False, 'Dropout probability': 0.1233047521881071, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[2048, 512, 128]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:48:35,292] Trial 17 finished with value: 0.8185230685230686 and parameters: {'Learning rate': 0.0007836764144585606, 'Usage of bias': False, 'Dropout probability': 0.13670865214772776, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[2048, 512, 128]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:49:04,408] Trial 18 finished with value: 0.7982527982527983 and parameters: {'Learning rate': 0.0008298745205895834, 'Usage of bias': True, 'Dropout probability': 0.12578624896347781, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'ReLU', 'Hidden dimensions': '[2048, 512, 128]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:49:33,575] Trial 19 finished with value: 0.8196833196833196 and parameters: {'Learning rate': 4.983423895726113e-05, 'Usage of bias': False, 'Dropout probability': 0.1254505773673659, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[2048, 512, 128]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:50:02,300] Trial 20 finished with value: 0.7982527982527983 and parameters: {'Learning rate': 4.7055158741274184e-05, 'Usage of bias': False, 'Dropout probability': 0.043026921091084136, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[2048]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:50:31,772] Trial 21 finished with value: 0.8153835653835654 and parameters: {'Learning rate': 8.163967335502179e-05, 'Usage of bias': False, 'Dropout probability': 0.1428780641278338, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[2048, 512, 128]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:51:01,470] Trial 22 finished with value: 0.7727272727272727 and parameters: {'Learning rate': 0.000667989611814456, 'Usage of bias': False, 'Dropout probability': 0.1267829537189608, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[2048, 512, 128]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:51:30,939] Trial 23 finished with value: 0.7992082992082992 and parameters: {'Learning rate': 0.0006026107110668222, 'Usage of bias': False, 'Dropout probability': 0.19824525859285494, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[2048, 512, 128]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:51:59,821] Trial 24 finished with value: 0.7656292656292656 and parameters: {'Learning rate': 3.509283733901235e-05, 'Usage of bias': False, 'Dropout probability': 0.11690236962744396, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[256]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:52:29,276] Trial 25 finished with value: 0.23532623532623534 and parameters: {'Learning rate': 9.950478243169402e-05, 'Usage of bias': False, 'Dropout probability': 0.07843744442856497, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:52:58,832] Trial 26 finished with value: 0.34875784875784877 and parameters: {'Learning rate': 0.0008864390427819463, 'Usage of bias': True, 'Dropout probability': 0.15386237946981063, 'Normalization layer': None, 'Activation layer': 'ReLU', 'Hidden dimensions': '[128, 32]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:53:27,940] Trial 27 finished with value: 0.6730821730821731 and parameters: {'Learning rate': 0.0012063421487291824, 'Usage of bias': False, 'Dropout probability': 0.11145209721243535, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[64]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:53:57,322] Trial 28 finished with value: 0.7734780234780235 and parameters: {'Learning rate': 0.000559126045633855, 'Usage of bias': False, 'Dropout probability': 0.2129874505165566, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[512, 128, 32]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:54:26,554] Trial 29 finished with value: 0.7654927654927655 and parameters: {'Learning rate': 0.00013689573286386524, 'Usage of bias': True, 'Dropout probability': 0.14806018767964133, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'ReLU', 'Hidden dimensions': '[512]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:54:55,453] Trial 30 finished with value: 0.7815997815997816 and parameters: {'Learning rate': 0.003545677112875742, 'Usage of bias': False, 'Dropout probability': 0.036861078381225224, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[1024, 256, 64]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:55:25,390] Trial 31 finished with value: 0.7872645372645373 and parameters: {'Learning rate': 8.377728623662905e-05, 'Usage of bias': False, 'Dropout probability': 0.1348256731971776, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[2048, 512, 128]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:55:54,736] Trial 32 finished with value: 0.7893120393120393 and parameters: {'Learning rate': 3.5784507496755064e-05, 'Usage of bias': False, 'Dropout probability': 0.17113361658124776, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[2048, 512, 128]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:56:24,313] Trial 33 finished with value: 0.7841250341250341 and parameters: {'Learning rate': 6.275018740160034e-05, 'Usage of bias': False, 'Dropout probability': 0.09824239728215128, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[2048, 512, 128]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:56:53,316] Trial 34 finished with value: 0.8017335517335518 and parameters: {'Learning rate': 2.4486631289673608e-05, 'Usage of bias': False, 'Dropout probability': 0.14824142834483844, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[2048, 512, 128]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:57:22,884] Trial 35 finished with value: 0.7012694512694513 and parameters: {'Learning rate': 0.0003835302032543184, 'Usage of bias': False, 'Dropout probability': 0.07670767434059664, 'Normalization layer': None, 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[2048]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:57:52,692] Trial 36 finished with value: 0.7531395031395032 and parameters: {'Learning rate': 0.00021417247287846476, 'Usage of bias': False, 'Dropout probability': 0.08777442459681692, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'ReLU', 'Hidden dimensions': '[512, 128, 32]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:58:21,966] Trial 37 finished with value: 0.8032350532350533 and parameters: {'Learning rate': 0.00017668885447954264, 'Usage of bias': True, 'Dropout probability': 0.1860395491216686, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[2048, 512, 128]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:58:51,377] Trial 38 finished with value: 0.2371007371007371 and parameters: {'Learning rate': 0.0003486079804346477, 'Usage of bias': False, 'Dropout probability': 0.13448382620165733, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:59:20,433] Trial 39 finished with value: 0.6340431340431341 and parameters: {'Learning rate': 0.00010654248992106639, 'Usage of bias': False, 'Dropout probability': 0.00730196663257815, 'Normalization layer': None, 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[512, 128]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 18:59:49,908] Trial 40 finished with value: 0.8024160524160524 and parameters: {'Learning rate': 2.1028677164192212e-05, 'Usage of bias': True, 'Dropout probability': 0.10049667310491042, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'ReLU', 'Hidden dimensions': '[1024, 256]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 19:00:19,518] Trial 41 finished with value: 0.7745700245700246 and parameters: {'Learning rate': 0.00017523131063200919, 'Usage of bias': True, 'Dropout probability': 0.18816787383078176, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[2048, 512, 128]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 19:00:48,972] Trial 42 finished with value: 0.8022113022113022 and parameters: {'Learning rate': 0.00025315444232247336, 'Usage of bias': True, 'Dropout probability': 0.22486988971597555, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[2048, 512, 128]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 19:01:18,384] Trial 43 finished with value: 0.7573710073710074 and parameters: {'Learning rate': 0.0004522607892040078, 'Usage of bias': True, 'Dropout probability': 0.16700914858964888, 'Normalization layer': 'LayerNorm', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[2048, 512, 128]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 19:01:47,851] Trial 44 finished with value: 0.7920420420420421 and parameters: {'Learning rate': 0.00014642931489935163, 'Usage of bias': True, 'Dropout probability': 0.18418978805084923, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[2048, 512, 128]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 19:02:17,099] Trial 45 finished with value: 0.7971607971607971 and parameters: {'Learning rate': 6.76049376783686e-05, 'Usage of bias': True, 'Dropout probability': 0.14245339874748553, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[1024, 256, 64]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 19:02:46,510] Trial 46 finished with value: 0.7684957684957685 and parameters: {'Learning rate': 0.0002481070497322132, 'Usage of bias': False, 'Dropout probability': 0.2702104053627564, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[2048, 512, 128]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 19:03:15,665] Trial 47 finished with value: 0.6734916734916735 and parameters: {'Learning rate': 0.0012352123092370158, 'Usage of bias': False, 'Dropout probability': 0.11569866538550583, 'Normalization layer': 'LayerNorm', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[128, 32]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 19:03:45,038] Trial 48 finished with value: 0.7727955227955228 and parameters: {'Learning rate': 4.77053269449618e-05, 'Usage of bias': True, 'Dropout probability': 0.20614143069141466, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[512, 128]'}. Best is trial 16 with value: 0.8218673218673219.
[I 2024-05-15 19:04:13,906] Trial 49 finished with value: 0.8133360633360633 and parameters: {'Learning rate': 0.00011570852363223616, 'Usage of bias': False, 'Dropout probability': 0.16082793975328613, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[2048, 512, 128]'}. Best is trial 16 with value: 0.8218673218673219.
Study results
    Trials finished: 50
    Trials completed: 50
Best trial
    Number    : 16
    Value     : 0.8218673218673219
    Params    : {'Learning rate': 0.0006592586229589168, 'Usage of bias': False, 'Dropout probability': 0.1233047521881071, 'Normalization layer': 'BatchNorm1d', 'Activation layer': 'LeakyReLU', 'Hidden dimensions': '[2048, 512, 128]'}
Optimizing finished
In [ ]:
study_mlp = utils_optuna.load_study(path_db=path_dir_exp_mlp / "optuna.db")
name_target_mlp = config.OPTIMIZATION_HYPERPARAMS["metric"]

print(f"Best {name_target_mlp}: {study_mlp.best_value}")
print(f"Best parameters")
for param, value in study_mlp.best_params.items():
    print(f"    {param:<30}: {value}")
Best Accuracy: 0.8218673218673219
Best parameters
    Learning rate                 : 0.0006592586229589168
    Usage of bias                 : False
    Dropout probability           : 0.1233047521881071
    Normalization layer           : BatchNorm1d
    Activation layer              : LeakyReLU
    Hidden dimensions             : [2048, 512, 128]
In [ ]:
optuna.visualization.plot_slice(study_mlp, target_name=name_target_mlp).show()
optuna.visualization.plot_param_importances(study_mlp, target_name=name_target_mlp).show()
optuna.visualization.plot_rank(study_mlp, target_name=name_target_mlp).update_layout(width=1600, height=1600).show()
/tmp/ipykernel_2849573/1680062555.py:3: ExperimentalWarning:

plot_rank is experimental (supported from v3.2.0). The interface can change in the future.

These results seem sensible. A larger model performed best in the study. However, the advantage seems not too large. For reasons of efficiency, I chose hidden dimensions of [1024, 256] which performed reasonable as well. Also, I chose to use bias as it does no harm and not using the bias seems odd to me. The experiment config file has been updated according to the best parameters printed above.

Training¶

In [ ]:
trainer_mlp = Trainer("svhn_mlp")
trainer_mlp.loop(config.TRAINING["num_epochs"])
Setting up dataloaders...
Train dataset
Dataset SVHN
    Number of datapoints: 58605
    Root location: /home/user/karacora/lab-vision-systems-assignments/assignment_2/data/svhn
    Split: train
    Transform: Compose(
    ToTensor()
)
    Transform of target: Compose(
)
Validate dataset
Dataset SVHN
    Number of datapoints: 14652
    Root location: /home/user/karacora/lab-vision-systems-assignments/assignment_2/data/svhn
    Split: validate
    Transform: Compose(
    ToTensor()
)
    Transform of target: Compose(
)
Setting up dataloaders finished
Setting up model...
Model
MLP(
  (head): Sequential(
    (0): Flatten(start_dim=1, end_dim=-1)
    (1): Linear(in_features=3072, out_features=1024, bias=True)
    (2): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (3): ReLU()
    (4): Dropout(p=0.0325, inplace=False)
    (5): Linear(in_features=1024, out_features=256, bias=True)
    (6): BatchNorm1d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (7): ReLU()
    (8): Dropout(p=0.0325, inplace=False)
    (9): Linear(in_features=256, out_features=64, bias=True)
    (10): BatchNorm1d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (11): ReLU()
    (12): Dropout(p=0.0325, inplace=False)
    (13): Linear(in_features=64, out_features=10, bias=True)
  )
)
Setting up model finished
Setting up optimizer...
Setting up optimizer finished
Setting up criterion...
Setting up criterion finished
Setting up measurers...
Setting up measurers finished
Looping...
Validating: Epoch 000 | Batch 030 | Loss 2.29769:  40%|███▉      | 23/58 [00:00<00:00, 73.86it/s]
Validating: Epoch 000 | Batch 050 | Loss 2.29746: 100%|██████████| 58/58 [00:00<00:00, 91.32it/s] 
Training: Epoch 001 | Batch 220 | Loss 1.44884: 100%|██████████| 229/229 [00:01<00:00, 116.14it/s]
Validating: Epoch 001 | Batch 050 | Loss 1.48258: 100%|██████████| 58/58 [00:00<00:00, 95.59it/s] 
Training: Epoch 002 | Batch 220 | Loss 1.14397: 100%|██████████| 229/229 [00:01<00:00, 115.30it/s]
Validating: Epoch 002 | Batch 050 | Loss 1.21703: 100%|██████████| 58/58 [00:00<00:00, 101.90it/s]
Training: Epoch 003 | Batch 220 | Loss 1.02277: 100%|██████████| 229/229 [00:01<00:00, 115.03it/s]
Validating: Epoch 003 | Batch 050 | Loss 1.08602: 100%|██████████| 58/58 [00:00<00:00, 95.52it/s] 
Training: Epoch 004 | Batch 220 | Loss 0.75710: 100%|██████████| 229/229 [00:01<00:00, 116.69it/s]
Validating: Epoch 004 | Batch 050 | Loss 0.96559: 100%|██████████| 58/58 [00:00<00:00, 99.26it/s] 
Training: Epoch 005 | Batch 220 | Loss 0.68744: 100%|██████████| 229/229 [00:01<00:00, 120.63it/s]
Validating: Epoch 005 | Batch 050 | Loss 1.01125: 100%|██████████| 58/58 [00:00<00:00, 91.62it/s] 
Training: Epoch 006 | Batch 220 | Loss 0.78213: 100%|██████████| 229/229 [00:02<00:00, 109.75it/s]
Validating: Epoch 006 | Batch 050 | Loss 0.83992: 100%|██████████| 58/58 [00:00<00:00, 92.89it/s] 
Training: Epoch 007 | Batch 220 | Loss 0.67919: 100%|██████████| 229/229 [00:02<00:00, 103.28it/s]
Validating: Epoch 007 | Batch 050 | Loss 1.00995: 100%|██████████| 58/58 [00:00<00:00, 96.48it/s] 
Training: Epoch 008 | Batch 220 | Loss 0.57942: 100%|██████████| 229/229 [00:02<00:00, 107.29it/s]
Validating: Epoch 008 | Batch 050 | Loss 0.77027: 100%|██████████| 58/58 [00:00<00:00, 103.02it/s]
Training: Epoch 009 | Batch 220 | Loss 0.55430: 100%|██████████| 229/229 [00:01<00:00, 116.98it/s]
Validating: Epoch 009 | Batch 050 | Loss 0.81621: 100%|██████████| 58/58 [00:00<00:00, 89.68it/s] 
Training: Epoch 010 | Batch 220 | Loss 0.44309: 100%|██████████| 229/229 [00:02<00:00, 111.92it/s]
Validating: Epoch 010 | Batch 050 | Loss 0.80743: 100%|██████████| 58/58 [00:00<00:00, 90.17it/s] 
Training: Epoch 011 | Batch 220 | Loss 0.60181: 100%|██████████| 229/229 [00:02<00:00, 107.98it/s]
Validating: Epoch 011 | Batch 050 | Loss 0.68887: 100%|██████████| 58/58 [00:00<00:00, 92.32it/s] 
Training: Epoch 012 | Batch 220 | Loss 0.45441: 100%|██████████| 229/229 [00:02<00:00, 110.06it/s]
Validating: Epoch 012 | Batch 050 | Loss 0.63143: 100%|██████████| 58/58 [00:00<00:00, 92.37it/s] 
Training: Epoch 013 | Batch 220 | Loss 0.42353: 100%|██████████| 229/229 [00:02<00:00, 113.84it/s]
Validating: Epoch 013 | Batch 050 | Loss 0.60190: 100%|██████████| 58/58 [00:00<00:00, 95.13it/s] 
Training: Epoch 014 | Batch 220 | Loss 0.42911: 100%|██████████| 229/229 [00:01<00:00, 118.69it/s]
Validating: Epoch 014 | Batch 050 | Loss 0.86809: 100%|██████████| 58/58 [00:00<00:00, 91.52it/s] 
Training: Epoch 015 | Batch 220 | Loss 0.46147: 100%|██████████| 229/229 [00:01<00:00, 114.79it/s]
Validating: Epoch 015 | Batch 050 | Loss 0.66095: 100%|██████████| 58/58 [00:00<00:00, 96.82it/s] 
Training: Epoch 016 | Batch 220 | Loss 0.41228: 100%|██████████| 229/229 [00:02<00:00, 112.73it/s]
Validating: Epoch 016 | Batch 050 | Loss 0.55678: 100%|██████████| 58/58 [00:00<00:00, 97.73it/s] 
Training: Epoch 017 | Batch 220 | Loss 0.33517: 100%|██████████| 229/229 [00:01<00:00, 119.64it/s]
Validating: Epoch 017 | Batch 050 | Loss 0.52528: 100%|██████████| 58/58 [00:00<00:00, 92.97it/s] 
Training: Epoch 018 | Batch 220 | Loss 0.34203: 100%|██████████| 229/229 [00:01<00:00, 116.44it/s]
Validating: Epoch 018 | Batch 050 | Loss 0.59155: 100%|██████████| 58/58 [00:00<00:00, 90.86it/s] 
Training: Epoch 019 | Batch 220 | Loss 0.39919: 100%|██████████| 229/229 [00:02<00:00, 111.50it/s]
Validating: Epoch 019 | Batch 050 | Loss 0.77194: 100%|██████████| 58/58 [00:00<00:00, 97.35it/s] 
Training: Epoch 020 | Batch 220 | Loss 0.32870: 100%|██████████| 229/229 [00:02<00:00, 111.66it/s]
Validating: Epoch 020 | Batch 050 | Loss 0.93120: 100%|██████████| 58/58 [00:00<00:00, 94.10it/s] 
Training: Epoch 021 | Batch 220 | Loss 0.35479: 100%|██████████| 229/229 [00:01<00:00, 119.43it/s]
Validating: Epoch 021 | Batch 050 | Loss 0.55031: 100%|██████████| 58/58 [00:00<00:00, 92.79it/s] 
Training: Epoch 022 | Batch 220 | Loss 0.44557: 100%|██████████| 229/229 [00:02<00:00, 114.20it/s]
Validating: Epoch 022 | Batch 050 | Loss 0.58542: 100%|██████████| 58/58 [00:00<00:00, 95.35it/s] 
Training: Epoch 023 | Batch 220 | Loss 0.26143: 100%|██████████| 229/229 [00:01<00:00, 116.78it/s]
Validating: Epoch 023 | Batch 050 | Loss 0.54384: 100%|██████████| 58/58 [00:00<00:00, 98.52it/s] 
Training: Epoch 024 | Batch 220 | Loss 0.30785: 100%|██████████| 229/229 [00:02<00:00, 113.50it/s]
Validating: Epoch 024 | Batch 050 | Loss 0.47734: 100%|██████████| 58/58 [00:00<00:00, 88.29it/s] 
Training: Epoch 025 | Batch 220 | Loss 0.25341: 100%|██████████| 229/229 [00:01<00:00, 120.69it/s]
Validating: Epoch 025 | Batch 050 | Loss 0.58586: 100%|██████████| 58/58 [00:00<00:00, 94.59it/s] 
Training: Epoch 026 | Batch 220 | Loss 0.24008: 100%|██████████| 229/229 [00:01<00:00, 119.78it/s]
Validating: Epoch 026 | Batch 050 | Loss 0.68904: 100%|██████████| 58/58 [00:00<00:00, 93.41it/s] 
Training: Epoch 027 | Batch 220 | Loss 0.28806: 100%|██████████| 229/229 [00:02<00:00, 111.17it/s]
Validating: Epoch 027 | Batch 050 | Loss 0.36970: 100%|██████████| 58/58 [00:00<00:00, 99.22it/s] 
Training: Epoch 028 | Batch 220 | Loss 0.28831: 100%|██████████| 229/229 [00:02<00:00, 112.44it/s]
Validating: Epoch 028 | Batch 050 | Loss 0.49937: 100%|██████████| 58/58 [00:00<00:00, 101.93it/s]
Training: Epoch 029 | Batch 220 | Loss 0.29348: 100%|██████████| 229/229 [00:02<00:00, 113.51it/s]
Validating: Epoch 029 | Batch 050 | Loss 0.57052: 100%|██████████| 58/58 [00:00<00:00, 88.85it/s] 
Training: Epoch 030 | Batch 220 | Loss 0.27490: 100%|██████████| 229/229 [00:01<00:00, 115.01it/s]
Validating: Epoch 030 | Batch 050 | Loss 0.42811: 100%|██████████| 58/58 [00:00<00:00, 93.26it/s] 
Training: Epoch 031 | Batch 220 | Loss 0.18704: 100%|██████████| 229/229 [00:01<00:00, 118.64it/s]
Validating: Epoch 031 | Batch 050 | Loss 0.59890: 100%|██████████| 58/58 [00:00<00:00, 90.17it/s] 
Training: Epoch 032 | Batch 220 | Loss 0.23147: 100%|██████████| 229/229 [00:01<00:00, 115.33it/s]
Validating: Epoch 032 | Batch 050 | Loss 0.61284: 100%|██████████| 58/58 [00:00<00:00, 95.02it/s] 
Training: Epoch 033 | Batch 220 | Loss 0.23777: 100%|██████████| 229/229 [00:01<00:00, 115.99it/s]
Validating: Epoch 033 | Batch 050 | Loss 0.32755: 100%|██████████| 58/58 [00:00<00:00, 91.84it/s] 
Training: Epoch 034 | Batch 220 | Loss 0.25664: 100%|██████████| 229/229 [00:01<00:00, 114.75it/s]
Validating: Epoch 034 | Batch 050 | Loss 0.63775: 100%|██████████| 58/58 [00:00<00:00, 89.82it/s] 
Training: Epoch 035 | Batch 220 | Loss 0.20385: 100%|██████████| 229/229 [00:02<00:00, 112.59it/s]
Validating: Epoch 035 | Batch 050 | Loss 0.47923: 100%|██████████| 58/58 [00:00<00:00, 94.60it/s] 
Training: Epoch 036 | Batch 220 | Loss 0.28836: 100%|██████████| 229/229 [00:01<00:00, 116.52it/s]
Validating: Epoch 036 | Batch 050 | Loss 0.49524: 100%|██████████| 58/58 [00:00<00:00, 92.67it/s] 
Training: Epoch 037 | Batch 220 | Loss 0.21064: 100%|██████████| 229/229 [00:02<00:00, 109.99it/s]
Validating: Epoch 037 | Batch 050 | Loss 0.62501: 100%|██████████| 58/58 [00:00<00:00, 96.58it/s] 
Training: Epoch 038 | Batch 220 | Loss 0.25606: 100%|██████████| 229/229 [00:02<00:00, 113.98it/s]
Validating: Epoch 038 | Batch 050 | Loss 0.48557: 100%|██████████| 58/58 [00:00<00:00, 94.32it/s] 
Training: Epoch 039 | Batch 220 | Loss 0.28714: 100%|██████████| 229/229 [00:02<00:00, 113.95it/s]
Validating: Epoch 039 | Batch 050 | Loss 0.86517: 100%|██████████| 58/58 [00:00<00:00, 89.90it/s] 
Training: Epoch 040 | Batch 220 | Loss 0.27644: 100%|██████████| 229/229 [00:02<00:00, 114.46it/s]
Validating: Epoch 040 | Batch 050 | Loss 0.79127: 100%|██████████| 58/58 [00:00<00:00, 95.65it/s] 
Training: Epoch 041 | Batch 220 | Loss 0.19688: 100%|██████████| 229/229 [00:01<00:00, 115.28it/s]
Validating: Epoch 041 | Batch 050 | Loss 0.36012: 100%|██████████| 58/58 [00:00<00:00, 94.56it/s] 
Training: Epoch 042 | Batch 220 | Loss 0.16332: 100%|██████████| 229/229 [00:02<00:00, 113.81it/s]
Validating: Epoch 042 | Batch 050 | Loss 0.47882: 100%|██████████| 58/58 [00:00<00:00, 93.07it/s] 
Training: Epoch 043 | Batch 220 | Loss 0.26740: 100%|██████████| 229/229 [00:01<00:00, 115.42it/s]
Validating: Epoch 043 | Batch 050 | Loss 0.89461: 100%|██████████| 58/58 [00:00<00:00, 93.64it/s] 
Training: Epoch 044 | Batch 220 | Loss 0.16535: 100%|██████████| 229/229 [00:01<00:00, 118.07it/s]
Validating: Epoch 044 | Batch 050 | Loss 0.32351: 100%|██████████| 58/58 [00:00<00:00, 92.46it/s] 
Training: Epoch 045 | Batch 220 | Loss 0.19325: 100%|██████████| 229/229 [00:02<00:00, 112.40it/s]
Validating: Epoch 045 | Batch 050 | Loss 0.43620: 100%|██████████| 58/58 [00:00<00:00, 87.57it/s] 
Training: Epoch 046 | Batch 220 | Loss 0.17916: 100%|██████████| 229/229 [00:01<00:00, 118.26it/s]
Validating: Epoch 046 | Batch 050 | Loss 0.49224: 100%|██████████| 58/58 [00:00<00:00, 96.95it/s] 
Training: Epoch 047 | Batch 220 | Loss 0.14384: 100%|██████████| 229/229 [00:02<00:00, 111.05it/s]
Validating: Epoch 047 | Batch 050 | Loss 0.58357: 100%|██████████| 58/58 [00:00<00:00, 90.15it/s] 
Training: Epoch 048 | Batch 220 | Loss 0.23456: 100%|██████████| 229/229 [00:02<00:00, 109.10it/s]
Validating: Epoch 048 | Batch 050 | Loss 0.42516: 100%|██████████| 58/58 [00:00<00:00, 97.82it/s] 
Training: Epoch 049 | Batch 220 | Loss 0.22493: 100%|██████████| 229/229 [00:02<00:00, 111.67it/s]
Validating: Epoch 049 | Batch 050 | Loss 0.80351: 100%|██████████| 58/58 [00:00<00:00, 100.66it/s]
Training: Epoch 050 | Batch 220 | Loss 0.19933: 100%|██████████| 229/229 [00:02<00:00, 113.95it/s]
Validating: Epoch 050 | Batch 050 | Loss 0.52819: 100%|██████████| 58/58 [00:00<00:00, 97.74it/s] 
Training: Epoch 051 | Batch 220 | Loss 0.21701: 100%|██████████| 229/229 [00:02<00:00, 106.96it/s]
Validating: Epoch 051 | Batch 050 | Loss 0.29599: 100%|██████████| 58/58 [00:00<00:00, 99.22it/s] 
Training: Epoch 052 | Batch 220 | Loss 0.18086: 100%|██████████| 229/229 [00:01<00:00, 118.68it/s]
Validating: Epoch 052 | Batch 050 | Loss 0.28437: 100%|██████████| 58/58 [00:00<00:00, 93.44it/s] 
Training: Epoch 053 | Batch 220 | Loss 0.17774: 100%|██████████| 229/229 [00:01<00:00, 116.72it/s]
Validating: Epoch 053 | Batch 050 | Loss 0.42999: 100%|██████████| 58/58 [00:00<00:00, 90.85it/s] 
Training: Epoch 054 | Batch 220 | Loss 0.18556: 100%|██████████| 229/229 [00:02<00:00, 112.85it/s]
Validating: Epoch 054 | Batch 050 | Loss 0.29965: 100%|██████████| 58/58 [00:00<00:00, 91.68it/s] 
Training: Epoch 055 | Batch 220 | Loss 0.15890: 100%|██████████| 229/229 [00:01<00:00, 118.92it/s]
Validating: Epoch 055 | Batch 050 | Loss 0.66625: 100%|██████████| 58/58 [00:00<00:00, 85.80it/s] 
Training: Epoch 056 | Batch 220 | Loss 0.16146: 100%|██████████| 229/229 [00:01<00:00, 116.89it/s]
Validating: Epoch 056 | Batch 050 | Loss 0.38084: 100%|██████████| 58/58 [00:00<00:00, 88.60it/s] 
Training: Epoch 057 | Batch 220 | Loss 0.12205: 100%|██████████| 229/229 [00:02<00:00, 113.11it/s]
Validating: Epoch 057 | Batch 050 | Loss 0.34619: 100%|██████████| 58/58 [00:00<00:00, 100.02it/s]
Training: Epoch 058 | Batch 220 | Loss 0.17817: 100%|██████████| 229/229 [00:02<00:00, 108.56it/s]
Validating: Epoch 058 | Batch 050 | Loss 0.62793: 100%|██████████| 58/58 [00:00<00:00, 93.09it/s] 
Training: Epoch 059 | Batch 220 | Loss 0.11059: 100%|██████████| 229/229 [00:02<00:00, 109.48it/s]
Validating: Epoch 059 | Batch 050 | Loss 0.56963: 100%|██████████| 58/58 [00:00<00:00, 96.67it/s] 
Training: Epoch 060 | Batch 220 | Loss 0.17391: 100%|██████████| 229/229 [00:02<00:00, 107.42it/s]
Validating: Epoch 060 | Batch 050 | Loss 0.44324: 100%|██████████| 58/58 [00:00<00:00, 99.69it/s] 
Training: Epoch 061 | Batch 220 | Loss 0.18549: 100%|██████████| 229/229 [00:01<00:00, 117.64it/s]
Validating: Epoch 061 | Batch 050 | Loss 0.42772: 100%|██████████| 58/58 [00:00<00:00, 96.31it/s] 
Training: Epoch 062 | Batch 220 | Loss 0.16231: 100%|██████████| 229/229 [00:01<00:00, 119.83it/s]
Validating: Epoch 062 | Batch 050 | Loss 0.23985: 100%|██████████| 58/58 [00:00<00:00, 88.82it/s] 
Training: Epoch 063 | Batch 220 | Loss 0.11181: 100%|██████████| 229/229 [00:02<00:00, 112.11it/s]
Validating: Epoch 063 | Batch 050 | Loss 0.20190: 100%|██████████| 58/58 [00:00<00:00, 94.53it/s] 
Training: Epoch 064 | Batch 220 | Loss 0.12812: 100%|██████████| 229/229 [00:02<00:00, 106.26it/s]
Validating: Epoch 064 | Batch 050 | Loss 0.35758: 100%|██████████| 58/58 [00:00<00:00, 97.87it/s] 
Training: Epoch 065 | Batch 220 | Loss 0.15992: 100%|██████████| 229/229 [00:01<00:00, 116.80it/s]
Validating: Epoch 065 | Batch 050 | Loss 0.26082: 100%|██████████| 58/58 [00:00<00:00, 92.83it/s] 
Training: Epoch 066 | Batch 220 | Loss 0.15630: 100%|██████████| 229/229 [00:01<00:00, 116.53it/s]
Validating: Epoch 066 | Batch 050 | Loss 0.40315: 100%|██████████| 58/58 [00:00<00:00, 94.94it/s] 
Training: Epoch 067 | Batch 220 | Loss 0.16775: 100%|██████████| 229/229 [00:02<00:00, 113.82it/s]
Validating: Epoch 067 | Batch 050 | Loss 0.47783: 100%|██████████| 58/58 [00:00<00:00, 91.05it/s] 
Training: Epoch 068 | Batch 220 | Loss 0.14555: 100%|██████████| 229/229 [00:01<00:00, 114.54it/s]
Validating: Epoch 068 | Batch 050 | Loss 0.23319: 100%|██████████| 58/58 [00:00<00:00, 92.69it/s] 
Training: Epoch 069 | Batch 220 | Loss 0.13070: 100%|██████████| 229/229 [00:01<00:00, 116.84it/s]
Validating: Epoch 069 | Batch 050 | Loss 0.35627: 100%|██████████| 58/58 [00:00<00:00, 98.39it/s] 
Training: Epoch 070 | Batch 220 | Loss 0.09891: 100%|██████████| 229/229 [00:02<00:00, 112.53it/s]
Validating: Epoch 070 | Batch 050 | Loss 0.28625: 100%|██████████| 58/58 [00:00<00:00, 97.39it/s] 
Training: Epoch 071 | Batch 220 | Loss 0.14103: 100%|██████████| 229/229 [00:02<00:00, 109.95it/s]
Validating: Epoch 071 | Batch 050 | Loss 0.27902: 100%|██████████| 58/58 [00:00<00:00, 94.03it/s] 
Training: Epoch 072 | Batch 220 | Loss 0.15015: 100%|██████████| 229/229 [00:01<00:00, 116.49it/s]
Validating: Epoch 072 | Batch 050 | Loss 0.71938: 100%|██████████| 58/58 [00:00<00:00, 90.34it/s] 
Training: Epoch 073 | Batch 220 | Loss 0.16925: 100%|██████████| 229/229 [00:01<00:00, 118.12it/s]
Validating: Epoch 073 | Batch 050 | Loss 0.42476: 100%|██████████| 58/58 [00:00<00:00, 93.61it/s] 
Training: Epoch 074 | Batch 220 | Loss 0.07816: 100%|██████████| 229/229 [00:01<00:00, 114.57it/s]
Validating: Epoch 074 | Batch 050 | Loss 0.32172: 100%|██████████| 58/58 [00:00<00:00, 93.44it/s] 
Training: Epoch 075 | Batch 220 | Loss 0.10649: 100%|██████████| 229/229 [00:02<00:00, 111.36it/s]
Validating: Epoch 075 | Batch 050 | Loss 0.31943: 100%|██████████| 58/58 [00:00<00:00, 101.64it/s]
Training: Epoch 076 | Batch 220 | Loss 0.15616: 100%|██████████| 229/229 [00:01<00:00, 115.57it/s]
Validating: Epoch 076 | Batch 050 | Loss 0.29466: 100%|██████████| 58/58 [00:00<00:00, 92.74it/s] 
Training: Epoch 077 | Batch 220 | Loss 0.10018: 100%|██████████| 229/229 [00:01<00:00, 120.11it/s]
Validating: Epoch 077 | Batch 050 | Loss 0.67675: 100%|██████████| 58/58 [00:00<00:00, 95.38it/s] 
Training: Epoch 078 | Batch 220 | Loss 0.14370: 100%|██████████| 229/229 [00:02<00:00, 102.62it/s]
Validating: Epoch 078 | Batch 050 | Loss 0.69079: 100%|██████████| 58/58 [00:00<00:00, 83.73it/s] 
Training: Epoch 079 | Batch 220 | Loss 0.08174: 100%|██████████| 229/229 [00:01<00:00, 114.91it/s]
Validating: Epoch 079 | Batch 050 | Loss 0.61443: 100%|██████████| 58/58 [00:00<00:00, 93.22it/s] 
Training: Epoch 080 | Batch 220 | Loss 0.16060: 100%|██████████| 229/229 [00:01<00:00, 117.87it/s]
Validating: Epoch 080 | Batch 050 | Loss 0.70900: 100%|██████████| 58/58 [00:00<00:00, 92.67it/s] 
Training: Epoch 081 | Batch 220 | Loss 0.12201: 100%|██████████| 229/229 [00:02<00:00, 112.96it/s]
Validating: Epoch 081 | Batch 050 | Loss 0.28214: 100%|██████████| 58/58 [00:00<00:00, 90.05it/s] 
Training: Epoch 082 | Batch 220 | Loss 0.13080: 100%|██████████| 229/229 [00:01<00:00, 117.45it/s]
Validating: Epoch 082 | Batch 050 | Loss 0.37571: 100%|██████████| 58/58 [00:00<00:00, 87.35it/s]
Training: Epoch 083 | Batch 220 | Loss 0.09336: 100%|██████████| 229/229 [00:02<00:00, 111.51it/s]
Validating: Epoch 083 | Batch 050 | Loss 0.27782: 100%|██████████| 58/58 [00:00<00:00, 103.16it/s]
Training: Epoch 084 | Batch 220 | Loss 0.11103: 100%|██████████| 229/229 [00:02<00:00, 109.59it/s]
Validating: Epoch 084 | Batch 050 | Loss 0.48066: 100%|██████████| 58/58 [00:00<00:00, 91.93it/s] 
Training: Epoch 085 | Batch 220 | Loss 0.09368: 100%|██████████| 229/229 [00:01<00:00, 118.51it/s]
Validating: Epoch 085 | Batch 050 | Loss 0.45599: 100%|██████████| 58/58 [00:00<00:00, 95.92it/s] 
Training: Epoch 086 | Batch 220 | Loss 0.09439: 100%|██████████| 229/229 [00:02<00:00, 114.04it/s]
Validating: Epoch 086 | Batch 050 | Loss 0.20643: 100%|██████████| 58/58 [00:00<00:00, 93.04it/s] 
Training: Epoch 087 | Batch 220 | Loss 0.08616: 100%|██████████| 229/229 [00:02<00:00, 112.32it/s]
Validating: Epoch 087 | Batch 050 | Loss 0.56362: 100%|██████████| 58/58 [00:00<00:00, 100.45it/s]
Training: Epoch 088 | Batch 220 | Loss 0.11933: 100%|██████████| 229/229 [00:01<00:00, 117.83it/s]
Validating: Epoch 088 | Batch 050 | Loss 0.26409: 100%|██████████| 58/58 [00:00<00:00, 94.85it/s] 
Training: Epoch 089 | Batch 220 | Loss 0.08965: 100%|██████████| 229/229 [00:01<00:00, 115.83it/s]
Validating: Epoch 089 | Batch 050 | Loss 0.36822: 100%|██████████| 58/58 [00:00<00:00, 91.00it/s] 
Training: Epoch 090 | Batch 220 | Loss 0.08977: 100%|██████████| 229/229 [00:01<00:00, 118.34it/s]
Validating: Epoch 090 | Batch 050 | Loss 0.41822: 100%|██████████| 58/58 [00:00<00:00, 97.34it/s] 
Training: Epoch 091 | Batch 220 | Loss 0.10854: 100%|██████████| 229/229 [00:02<00:00, 111.05it/s]
Validating: Epoch 091 | Batch 050 | Loss 0.57338: 100%|██████████| 58/58 [00:00<00:00, 90.79it/s] 
Training: Epoch 092 | Batch 220 | Loss 0.12473: 100%|██████████| 229/229 [00:01<00:00, 116.04it/s]
Validating: Epoch 092 | Batch 050 | Loss 0.43673: 100%|██████████| 58/58 [00:00<00:00, 99.91it/s] 
Training: Epoch 093 | Batch 220 | Loss 0.07517: 100%|██████████| 229/229 [00:02<00:00, 112.11it/s]
Validating: Epoch 093 | Batch 050 | Loss 0.46235: 100%|██████████| 58/58 [00:00<00:00, 96.63it/s] 
Training: Epoch 094 | Batch 220 | Loss 0.07097: 100%|██████████| 229/229 [00:02<00:00, 114.02it/s]
Validating: Epoch 094 | Batch 050 | Loss 0.40843: 100%|██████████| 58/58 [00:00<00:00, 91.90it/s] 
Training: Epoch 095 | Batch 220 | Loss 0.07136: 100%|██████████| 229/229 [00:01<00:00, 115.43it/s]
Validating: Epoch 095 | Batch 050 | Loss 0.32397: 100%|██████████| 58/58 [00:00<00:00, 100.78it/s]
Training: Epoch 096 | Batch 220 | Loss 0.05080: 100%|██████████| 229/229 [00:01<00:00, 118.20it/s]
Validating: Epoch 096 | Batch 050 | Loss 0.31415: 100%|██████████| 58/58 [00:00<00:00, 94.06it/s] 
Training: Epoch 097 | Batch 220 | Loss 0.12279: 100%|██████████| 229/229 [00:01<00:00, 116.36it/s]
Validating: Epoch 097 | Batch 050 | Loss 0.36568: 100%|██████████| 58/58 [00:00<00:00, 89.03it/s] 
Training: Epoch 098 | Batch 220 | Loss 0.11050: 100%|██████████| 229/229 [00:01<00:00, 115.91it/s]
Validating: Epoch 098 | Batch 050 | Loss 0.32316: 100%|██████████| 58/58 [00:00<00:00, 99.56it/s] 
Training: Epoch 099 | Batch 220 | Loss 0.08862: 100%|██████████| 229/229 [00:02<00:00, 110.04it/s]
Validating: Epoch 099 | Batch 050 | Loss 0.64882: 100%|██████████| 58/58 [00:00<00:00, 98.85it/s] 
Training: Epoch 100 | Batch 220 | Loss 0.09347: 100%|██████████| 229/229 [00:01<00:00, 115.99it/s]
Validating: Epoch 100 | Batch 050 | Loss 0.63111: 100%|██████████| 58/58 [00:00<00:00, 96.50it/s] 
Training: Epoch 101 | Batch 220 | Loss 0.09279: 100%|██████████| 229/229 [00:01<00:00, 116.58it/s]
Validating: Epoch 101 | Batch 050 | Loss 0.32720: 100%|██████████| 58/58 [00:00<00:00, 89.55it/s] 
Training: Epoch 102 | Batch 220 | Loss 0.08596: 100%|██████████| 229/229 [00:02<00:00, 110.36it/s]
Validating: Epoch 102 | Batch 050 | Loss 0.28475: 100%|██████████| 58/58 [00:00<00:00, 92.06it/s] 
Training: Epoch 103 | Batch 220 | Loss 0.04847: 100%|██████████| 229/229 [00:02<00:00, 113.40it/s]
Validating: Epoch 103 | Batch 050 | Loss 0.38569: 100%|██████████| 58/58 [00:00<00:00, 99.17it/s] 
Training: Epoch 104 | Batch 220 | Loss 0.11426: 100%|██████████| 229/229 [00:01<00:00, 114.96it/s]
Validating: Epoch 104 | Batch 050 | Loss 0.51034: 100%|██████████| 58/58 [00:00<00:00, 96.11it/s] 
Training: Epoch 105 | Batch 220 | Loss 0.08364: 100%|██████████| 229/229 [00:02<00:00, 109.52it/s]
Validating: Epoch 105 | Batch 050 | Loss 0.31246: 100%|██████████| 58/58 [00:00<00:00, 90.59it/s] 
Training: Epoch 106 | Batch 220 | Loss 0.09312: 100%|██████████| 229/229 [00:02<00:00, 111.14it/s]
Validating: Epoch 106 | Batch 050 | Loss 0.35399: 100%|██████████| 58/58 [00:00<00:00, 88.33it/s] 
Training: Epoch 107 | Batch 220 | Loss 0.05883: 100%|██████████| 229/229 [00:02<00:00, 113.46it/s]
Validating: Epoch 107 | Batch 050 | Loss 0.26199: 100%|██████████| 58/58 [00:00<00:00, 90.56it/s] 
Training: Epoch 108 | Batch 220 | Loss 0.06373: 100%|██████████| 229/229 [00:02<00:00, 113.65it/s]
Validating: Epoch 108 | Batch 050 | Loss 0.34874: 100%|██████████| 58/58 [00:00<00:00, 98.74it/s] 
Training: Epoch 109 | Batch 220 | Loss 0.07279: 100%|██████████| 229/229 [00:02<00:00, 113.83it/s]
Validating: Epoch 109 | Batch 050 | Loss 0.31779: 100%|██████████| 58/58 [00:00<00:00, 93.38it/s] 
Training: Epoch 110 | Batch 220 | Loss 0.12144: 100%|██████████| 229/229 [00:02<00:00, 113.52it/s]
Validating: Epoch 110 | Batch 050 | Loss 0.34560: 100%|██████████| 58/58 [00:00<00:00, 98.10it/s] 
Training: Epoch 111 | Batch 220 | Loss 0.07703: 100%|██████████| 229/229 [00:01<00:00, 118.37it/s]
Validating: Epoch 111 | Batch 050 | Loss 0.30435: 100%|██████████| 58/58 [00:00<00:00, 88.15it/s] 
Training: Epoch 112 | Batch 220 | Loss 0.07745: 100%|██████████| 229/229 [00:01<00:00, 115.85it/s]
Validating: Epoch 112 | Batch 050 | Loss 0.40456: 100%|██████████| 58/58 [00:00<00:00, 97.31it/s] 
Training: Epoch 113 | Batch 220 | Loss 0.05789: 100%|██████████| 229/229 [00:02<00:00, 111.45it/s]
Validating: Epoch 113 | Batch 050 | Loss 0.51495: 100%|██████████| 58/58 [00:00<00:00, 100.82it/s]
Training: Epoch 114 | Batch 220 | Loss 0.07520: 100%|██████████| 229/229 [00:01<00:00, 117.09it/s]
Validating: Epoch 114 | Batch 050 | Loss 0.23212: 100%|██████████| 58/58 [00:00<00:00, 98.85it/s] 
Training: Epoch 115 | Batch 220 | Loss 0.03026: 100%|██████████| 229/229 [00:01<00:00, 116.21it/s]
Validating: Epoch 115 | Batch 050 | Loss 0.30960: 100%|██████████| 58/58 [00:00<00:00, 99.96it/s] 
Training: Epoch 116 | Batch 220 | Loss 0.04869: 100%|██████████| 229/229 [00:01<00:00, 115.51it/s]
Validating: Epoch 116 | Batch 050 | Loss 0.35533: 100%|██████████| 58/58 [00:00<00:00, 96.82it/s] 
Training: Epoch 117 | Batch 220 | Loss 0.06862: 100%|██████████| 229/229 [00:02<00:00, 112.75it/s]
Validating: Epoch 117 | Batch 050 | Loss 0.35848: 100%|██████████| 58/58 [00:00<00:00, 93.64it/s] 
Training: Epoch 118 | Batch 220 | Loss 0.06970: 100%|██████████| 229/229 [00:01<00:00, 119.58it/s]
Validating: Epoch 118 | Batch 050 | Loss 0.23656: 100%|██████████| 58/58 [00:00<00:00, 93.69it/s] 
Training: Epoch 119 | Batch 220 | Loss 0.11605: 100%|██████████| 229/229 [00:02<00:00, 113.26it/s]
Validating: Epoch 119 | Batch 050 | Loss 0.56982: 100%|██████████| 58/58 [00:00<00:00, 93.97it/s] 
Training: Epoch 120 | Batch 220 | Loss 0.08512: 100%|██████████| 229/229 [00:02<00:00, 105.20it/s]
Validating: Epoch 120 | Batch 050 | Loss 0.53760: 100%|██████████| 58/58 [00:00<00:00, 93.51it/s] 
Training: Epoch 121 | Batch 220 | Loss 0.06769: 100%|██████████| 229/229 [00:02<00:00, 110.90it/s]
Validating: Epoch 121 | Batch 050 | Loss 0.34573: 100%|██████████| 58/58 [00:00<00:00, 96.81it/s] 
Training: Epoch 122 | Batch 220 | Loss 0.03783: 100%|██████████| 229/229 [00:02<00:00, 109.37it/s]
Validating: Epoch 122 | Batch 050 | Loss 0.37213: 100%|██████████| 58/58 [00:00<00:00, 89.94it/s] 
Training: Epoch 123 | Batch 220 | Loss 0.02579: 100%|██████████| 229/229 [00:01<00:00, 117.34it/s]
Validating: Epoch 123 | Batch 050 | Loss 0.32647: 100%|██████████| 58/58 [00:00<00:00, 95.46it/s] 
Training: Epoch 124 | Batch 220 | Loss 0.07568: 100%|██████████| 229/229 [00:02<00:00, 112.19it/s]
Validating: Epoch 124 | Batch 050 | Loss 0.28735: 100%|██████████| 58/58 [00:00<00:00, 97.88it/s] 
Training: Epoch 125 | Batch 220 | Loss 0.12033: 100%|██████████| 229/229 [00:02<00:00, 112.17it/s]
Validating: Epoch 125 | Batch 050 | Loss 0.26395: 100%|██████████| 58/58 [00:00<00:00, 93.42it/s] 
Training: Epoch 126 | Batch 220 | Loss 0.04857: 100%|██████████| 229/229 [00:02<00:00, 111.11it/s]
Validating: Epoch 126 | Batch 050 | Loss 0.22653: 100%|██████████| 58/58 [00:00<00:00, 87.21it/s] 
Training: Epoch 127 | Batch 220 | Loss 0.08698: 100%|██████████| 229/229 [00:01<00:00, 117.96it/s]
Validating: Epoch 127 | Batch 050 | Loss 0.34523: 100%|██████████| 58/58 [00:00<00:00, 95.34it/s] 
Training: Epoch 128 | Batch 220 | Loss 0.04496: 100%|██████████| 229/229 [00:02<00:00, 113.30it/s]
Validating: Epoch 128 | Batch 050 | Loss 0.44855: 100%|██████████| 58/58 [00:00<00:00, 97.21it/s] 
Training: Epoch 129 | Batch 220 | Loss 0.06313: 100%|██████████| 229/229 [00:02<00:00, 109.48it/s]
Validating: Epoch 129 | Batch 050 | Loss 0.41143: 100%|██████████| 58/58 [00:00<00:00, 89.78it/s] 
Training: Epoch 130 | Batch 220 | Loss 0.06661: 100%|██████████| 229/229 [00:02<00:00, 107.81it/s]
Validating: Epoch 130 | Batch 050 | Loss 0.53394: 100%|██████████| 58/58 [00:00<00:00, 98.01it/s] 
Training: Epoch 131 | Batch 220 | Loss 0.06165: 100%|██████████| 229/229 [00:01<00:00, 119.90it/s]
Validating: Epoch 131 | Batch 050 | Loss 0.32263: 100%|██████████| 58/58 [00:00<00:00, 97.64it/s] 
Training: Epoch 132 | Batch 220 | Loss 0.05217: 100%|██████████| 229/229 [00:01<00:00, 116.07it/s]
Validating: Epoch 132 | Batch 050 | Loss 0.23009: 100%|██████████| 58/58 [00:00<00:00, 98.81it/s] 
Training: Epoch 133 | Batch 220 | Loss 0.07725: 100%|██████████| 229/229 [00:01<00:00, 114.96it/s]
Validating: Epoch 133 | Batch 050 | Loss 0.42763: 100%|██████████| 58/58 [00:00<00:00, 93.48it/s] 
Training: Epoch 134 | Batch 220 | Loss 0.06948: 100%|██████████| 229/229 [00:02<00:00, 111.63it/s]
Validating: Epoch 134 | Batch 050 | Loss 0.38027: 100%|██████████| 58/58 [00:00<00:00, 94.03it/s] 
Training: Epoch 135 | Batch 220 | Loss 0.05445: 100%|██████████| 229/229 [00:01<00:00, 119.23it/s]
Validating: Epoch 135 | Batch 050 | Loss 0.23489: 100%|██████████| 58/58 [00:00<00:00, 90.24it/s] 
Training: Epoch 136 | Batch 220 | Loss 0.03632: 100%|██████████| 229/229 [00:01<00:00, 116.76it/s]
Validating: Epoch 136 | Batch 050 | Loss 0.41157: 100%|██████████| 58/58 [00:00<00:00, 95.02it/s] 
Training: Epoch 137 | Batch 220 | Loss 0.07137: 100%|██████████| 229/229 [00:01<00:00, 116.18it/s]
Validating: Epoch 137 | Batch 050 | Loss 0.25368: 100%|██████████| 58/58 [00:00<00:00, 86.91it/s] 
Training: Epoch 138 | Batch 220 | Loss 0.06197: 100%|██████████| 229/229 [00:02<00:00, 114.09it/s]
Validating: Epoch 138 | Batch 050 | Loss 1.03816: 100%|██████████| 58/58 [00:00<00:00, 97.48it/s] 
Training: Epoch 139 | Batch 220 | Loss 0.04232: 100%|██████████| 229/229 [00:02<00:00, 113.09it/s]
Validating: Epoch 139 | Batch 050 | Loss 0.26606: 100%|██████████| 58/58 [00:00<00:00, 94.20it/s] 
Training: Epoch 140 | Batch 220 | Loss 0.02435: 100%|██████████| 229/229 [00:01<00:00, 117.74it/s]
Validating: Epoch 140 | Batch 050 | Loss 0.37930: 100%|██████████| 58/58 [00:00<00:00, 99.23it/s] 
Training: Epoch 141 | Batch 220 | Loss 0.07725: 100%|██████████| 229/229 [00:02<00:00, 110.90it/s]
Validating: Epoch 141 | Batch 050 | Loss 0.33463: 100%|██████████| 58/58 [00:00<00:00, 94.11it/s] 
Training: Epoch 142 | Batch 220 | Loss 0.05313: 100%|██████████| 229/229 [00:02<00:00, 110.57it/s]
Validating: Epoch 142 | Batch 050 | Loss 0.26884: 100%|██████████| 58/58 [00:00<00:00, 97.57it/s] 
Training: Epoch 143 | Batch 220 | Loss 0.06533: 100%|██████████| 229/229 [00:02<00:00, 113.75it/s]
Validating: Epoch 143 | Batch 050 | Loss 0.32119: 100%|██████████| 58/58 [00:00<00:00, 97.17it/s] 
Training: Epoch 144 | Batch 220 | Loss 0.03395: 100%|██████████| 229/229 [00:01<00:00, 116.88it/s]
Validating: Epoch 144 | Batch 050 | Loss 0.35495: 100%|██████████| 58/58 [00:00<00:00, 94.42it/s] 
Training: Epoch 145 | Batch 220 | Loss 0.05752: 100%|██████████| 229/229 [00:02<00:00, 111.80it/s]
Validating: Epoch 145 | Batch 050 | Loss 0.38517: 100%|██████████| 58/58 [00:00<00:00, 97.21it/s] 
Training: Epoch 146 | Batch 220 | Loss 0.05084: 100%|██████████| 229/229 [00:01<00:00, 119.95it/s]
Validating: Epoch 146 | Batch 050 | Loss 0.24071: 100%|██████████| 58/58 [00:00<00:00, 95.96it/s] 
Training: Epoch 147 | Batch 220 | Loss 0.06235: 100%|██████████| 229/229 [00:01<00:00, 118.37it/s]
Validating: Epoch 147 | Batch 050 | Loss 0.48469: 100%|██████████| 58/58 [00:00<00:00, 95.08it/s] 
Training: Epoch 148 | Batch 220 | Loss 0.06149: 100%|██████████| 229/229 [00:02<00:00, 113.80it/s]
Validating: Epoch 148 | Batch 050 | Loss 0.36924: 100%|██████████| 58/58 [00:00<00:00, 93.07it/s] 
Training: Epoch 149 | Batch 220 | Loss 0.05114: 100%|██████████| 229/229 [00:02<00:00, 109.85it/s]
Validating: Epoch 149 | Batch 050 | Loss 0.25747: 100%|██████████| 58/58 [00:00<00:00, 88.88it/s] 
Training: Epoch 150 | Batch 220 | Loss 0.07087: 100%|██████████| 229/229 [00:02<00:00, 110.84it/s]
Validating: Epoch 150 | Batch 050 | Loss 0.52985: 100%|██████████| 58/58 [00:00<00:00, 96.86it/s] 
Training: Epoch 151 | Batch 220 | Loss 0.05141: 100%|██████████| 229/229 [00:01<00:00, 115.46it/s]
Validating: Epoch 151 | Batch 050 | Loss 0.31486: 100%|██████████| 58/58 [00:00<00:00, 89.65it/s] 
Training: Epoch 152 | Batch 220 | Loss 0.03508: 100%|██████████| 229/229 [00:01<00:00, 118.97it/s]
Validating: Epoch 152 | Batch 050 | Loss 0.20683: 100%|██████████| 58/58 [00:00<00:00, 95.20it/s] 
Training: Epoch 153 | Batch 220 | Loss 0.04414: 100%|██████████| 229/229 [00:02<00:00, 113.78it/s]
Validating: Epoch 153 | Batch 050 | Loss 0.16849: 100%|██████████| 58/58 [00:00<00:00, 85.89it/s] 
Training: Epoch 154 | Batch 220 | Loss 0.03416: 100%|██████████| 229/229 [00:02<00:00, 113.16it/s]
Validating: Epoch 154 | Batch 050 | Loss 0.23495: 100%|██████████| 58/58 [00:00<00:00, 90.27it/s] 
Training: Epoch 155 | Batch 220 | Loss 0.02561: 100%|██████████| 229/229 [00:02<00:00, 112.66it/s]
Validating: Epoch 155 | Batch 050 | Loss 0.27852: 100%|██████████| 58/58 [00:00<00:00, 97.85it/s] 
Training: Epoch 156 | Batch 220 | Loss 0.02562: 100%|██████████| 229/229 [00:01<00:00, 116.64it/s]
Validating: Epoch 156 | Batch 050 | Loss 0.18572: 100%|██████████| 58/58 [00:00<00:00, 90.62it/s] 
Training: Epoch 157 | Batch 220 | Loss 0.06350: 100%|██████████| 229/229 [00:02<00:00, 110.36it/s]
Validating: Epoch 157 | Batch 050 | Loss 0.36195: 100%|██████████| 58/58 [00:00<00:00, 92.61it/s] 
Training: Epoch 158 | Batch 220 | Loss 0.03839: 100%|██████████| 229/229 [00:01<00:00, 122.48it/s]
Validating: Epoch 158 | Batch 050 | Loss 0.23856: 100%|██████████| 58/58 [00:00<00:00, 90.94it/s] 
Training: Epoch 159 | Batch 220 | Loss 0.05170: 100%|██████████| 229/229 [00:01<00:00, 117.41it/s]
Validating: Epoch 159 | Batch 050 | Loss 0.18276: 100%|██████████| 58/58 [00:00<00:00, 89.09it/s] 
Training: Epoch 160 | Batch 220 | Loss 0.05699: 100%|██████████| 229/229 [00:02<00:00, 110.43it/s]
Validating: Epoch 160 | Batch 050 | Loss 0.21879: 100%|██████████| 58/58 [00:00<00:00, 94.50it/s] 
Training: Epoch 161 | Batch 220 | Loss 0.02999: 100%|██████████| 229/229 [00:02<00:00, 113.59it/s]
Validating: Epoch 161 | Batch 050 | Loss 0.86258: 100%|██████████| 58/58 [00:00<00:00, 99.80it/s] 
Training: Epoch 162 | Batch 220 | Loss 0.06177: 100%|██████████| 229/229 [00:02<00:00, 109.80it/s]
Validating: Epoch 162 | Batch 050 | Loss 0.27693: 100%|██████████| 58/58 [00:00<00:00, 96.57it/s] 
Training: Epoch 163 | Batch 220 | Loss 0.06099: 100%|██████████| 229/229 [00:02<00:00, 113.63it/s]
Validating: Epoch 163 | Batch 050 | Loss 0.39588: 100%|██████████| 58/58 [00:00<00:00, 88.46it/s] 
Training: Epoch 164 | Batch 220 | Loss 0.01970: 100%|██████████| 229/229 [00:02<00:00, 111.72it/s]
Validating: Epoch 164 | Batch 050 | Loss 0.31344: 100%|██████████| 58/58 [00:00<00:00, 87.05it/s] 
Training: Epoch 165 | Batch 220 | Loss 0.01919: 100%|██████████| 229/229 [00:02<00:00, 113.61it/s]
Validating: Epoch 165 | Batch 050 | Loss 0.24385: 100%|██████████| 58/58 [00:00<00:00, 96.96it/s] 
Training: Epoch 166 | Batch 220 | Loss 0.05517: 100%|██████████| 229/229 [00:02<00:00, 113.79it/s]
Validating: Epoch 166 | Batch 050 | Loss 0.48217: 100%|██████████| 58/58 [00:00<00:00, 89.45it/s] 
Training: Epoch 167 | Batch 220 | Loss 0.05209: 100%|██████████| 229/229 [00:02<00:00, 112.41it/s]
Validating: Epoch 167 | Batch 050 | Loss 0.33710: 100%|██████████| 58/58 [00:00<00:00, 87.17it/s] 
Training: Epoch 168 | Batch 220 | Loss 0.11624: 100%|██████████| 229/229 [00:01<00:00, 114.58it/s]
Validating: Epoch 168 | Batch 050 | Loss 0.52150: 100%|██████████| 58/58 [00:00<00:00, 93.62it/s] 
Training: Epoch 169 | Batch 220 | Loss 0.04509: 100%|██████████| 229/229 [00:01<00:00, 119.47it/s]
Validating: Epoch 169 | Batch 050 | Loss 0.22791: 100%|██████████| 58/58 [00:00<00:00, 96.42it/s] 
Training: Epoch 170 | Batch 220 | Loss 0.10050: 100%|██████████| 229/229 [00:02<00:00, 110.89it/s]
Validating: Epoch 170 | Batch 050 | Loss 0.37505: 100%|██████████| 58/58 [00:00<00:00, 87.88it/s] 
Training: Epoch 171 | Batch 220 | Loss 0.03029: 100%|██████████| 229/229 [00:02<00:00, 108.98it/s]
Validating: Epoch 171 | Batch 050 | Loss 0.31687: 100%|██████████| 58/58 [00:00<00:00, 91.95it/s] 
Training: Epoch 172 | Batch 220 | Loss 0.05840: 100%|██████████| 229/229 [00:02<00:00, 112.49it/s]
Validating: Epoch 172 | Batch 050 | Loss 0.24588: 100%|██████████| 58/58 [00:00<00:00, 90.38it/s] 
Training: Epoch 173 | Batch 220 | Loss 0.04614: 100%|██████████| 229/229 [00:02<00:00, 113.41it/s]
Validating: Epoch 173 | Batch 050 | Loss 0.37389: 100%|██████████| 58/58 [00:00<00:00, 95.23it/s] 
Training: Epoch 174 | Batch 220 | Loss 0.02693: 100%|██████████| 229/229 [00:02<00:00, 108.70it/s]
Validating: Epoch 174 | Batch 050 | Loss 0.21408: 100%|██████████| 58/58 [00:00<00:00, 89.83it/s] 
Training: Epoch 175 | Batch 220 | Loss 0.02258: 100%|██████████| 229/229 [00:02<00:00, 104.45it/s]
Validating: Epoch 175 | Batch 050 | Loss 0.29231: 100%|██████████| 58/58 [00:00<00:00, 89.59it/s] 
Training: Epoch 176 | Batch 220 | Loss 0.04379: 100%|██████████| 229/229 [00:02<00:00, 109.44it/s]
Validating: Epoch 176 | Batch 050 | Loss 0.38744: 100%|██████████| 58/58 [00:00<00:00, 84.77it/s]
Training: Epoch 177 | Batch 220 | Loss 0.06081: 100%|██████████| 229/229 [00:02<00:00, 102.32it/s]
Validating: Epoch 177 | Batch 050 | Loss 0.26701: 100%|██████████| 58/58 [00:00<00:00, 82.46it/s]
Training: Epoch 178 | Batch 220 | Loss 0.03000: 100%|██████████| 229/229 [00:02<00:00, 112.02it/s]
Validating: Epoch 178 | Batch 050 | Loss 0.40629: 100%|██████████| 58/58 [00:00<00:00, 92.41it/s] 
Training: Epoch 179 | Batch 220 | Loss 0.02426: 100%|██████████| 229/229 [00:02<00:00, 114.40it/s]
Validating: Epoch 179 | Batch 050 | Loss 0.23053: 100%|██████████| 58/58 [00:00<00:00, 92.49it/s] 
Training: Epoch 180 | Batch 220 | Loss 0.06186: 100%|██████████| 229/229 [00:02<00:00, 114.25it/s]
Validating: Epoch 180 | Batch 050 | Loss 0.26939: 100%|██████████| 58/58 [00:00<00:00, 87.93it/s] 
Training: Epoch 181 | Batch 220 | Loss 0.08052: 100%|██████████| 229/229 [00:01<00:00, 115.84it/s]
Validating: Epoch 181 | Batch 050 | Loss 0.33512: 100%|██████████| 58/58 [00:00<00:00, 88.10it/s]
Training: Epoch 182 | Batch 220 | Loss 0.06624: 100%|██████████| 229/229 [00:01<00:00, 121.11it/s]
Validating: Epoch 182 | Batch 050 | Loss 0.55068: 100%|██████████| 58/58 [00:00<00:00, 88.76it/s] 
Training: Epoch 183 | Batch 220 | Loss 0.06722: 100%|██████████| 229/229 [00:01<00:00, 116.33it/s]
Validating: Epoch 183 | Batch 050 | Loss 0.59151: 100%|██████████| 58/58 [00:00<00:00, 93.84it/s] 
Training: Epoch 184 | Batch 220 | Loss 0.05626: 100%|██████████| 229/229 [00:02<00:00, 111.22it/s]
Validating: Epoch 184 | Batch 050 | Loss 0.63717: 100%|██████████| 58/58 [00:00<00:00, 94.78it/s] 
Training: Epoch 185 | Batch 220 | Loss 0.02416: 100%|██████████| 229/229 [00:01<00:00, 116.81it/s]
Validating: Epoch 185 | Batch 050 | Loss 0.27119: 100%|██████████| 58/58 [00:00<00:00, 97.74it/s] 
Training: Epoch 186 | Batch 220 | Loss 0.02386: 100%|██████████| 229/229 [00:01<00:00, 116.98it/s]
Validating: Epoch 186 | Batch 050 | Loss 0.42549: 100%|██████████| 58/58 [00:00<00:00, 96.40it/s] 
Training: Epoch 187 | Batch 220 | Loss 0.05014: 100%|██████████| 229/229 [00:01<00:00, 117.47it/s]
Validating: Epoch 187 | Batch 050 | Loss 0.26539: 100%|██████████| 58/58 [00:00<00:00, 93.43it/s] 
Training: Epoch 188 | Batch 220 | Loss 0.08103: 100%|██████████| 229/229 [00:02<00:00, 107.99it/s]
Validating: Epoch 188 | Batch 050 | Loss 0.42867: 100%|██████████| 58/58 [00:00<00:00, 93.90it/s] 
Training: Epoch 189 | Batch 220 | Loss 0.03994: 100%|██████████| 229/229 [00:02<00:00, 110.79it/s]
Validating: Epoch 189 | Batch 050 | Loss 0.20997: 100%|██████████| 58/58 [00:00<00:00, 101.90it/s]
Training: Epoch 190 | Batch 220 | Loss 0.02856: 100%|██████████| 229/229 [00:02<00:00, 109.60it/s]
Validating: Epoch 190 | Batch 050 | Loss 0.42499: 100%|██████████| 58/58 [00:00<00:00, 87.30it/s] 
Training: Epoch 191 | Batch 220 | Loss 0.03696: 100%|██████████| 229/229 [00:02<00:00, 112.70it/s]
Validating: Epoch 191 | Batch 050 | Loss 0.20781: 100%|██████████| 58/58 [00:00<00:00, 91.49it/s] 
Training: Epoch 192 | Batch 220 | Loss 0.01678: 100%|██████████| 229/229 [00:02<00:00, 113.92it/s]
Validating: Epoch 192 | Batch 050 | Loss 0.49759: 100%|██████████| 58/58 [00:00<00:00, 86.91it/s]
Training: Epoch 193 | Batch 220 | Loss 0.05094: 100%|██████████| 229/229 [00:01<00:00, 115.59it/s]
Validating: Epoch 193 | Batch 050 | Loss 0.27137: 100%|██████████| 58/58 [00:00<00:00, 93.03it/s] 
Training: Epoch 194 | Batch 220 | Loss 0.01448: 100%|██████████| 229/229 [00:01<00:00, 117.40it/s]
Validating: Epoch 194 | Batch 050 | Loss 0.46315: 100%|██████████| 58/58 [00:00<00:00, 73.58it/s]
Training: Epoch 195 | Batch 220 | Loss 0.03391: 100%|██████████| 229/229 [00:02<00:00, 108.83it/s]
Validating: Epoch 195 | Batch 050 | Loss 0.18752: 100%|██████████| 58/58 [00:00<00:00, 86.22it/s] 
Training: Epoch 196 | Batch 220 | Loss 0.06228: 100%|██████████| 229/229 [00:02<00:00, 114.17it/s]
Validating: Epoch 196 | Batch 050 | Loss 0.31042: 100%|██████████| 58/58 [00:00<00:00, 89.91it/s] 
Training: Epoch 197 | Batch 220 | Loss 0.04400: 100%|██████████| 229/229 [00:02<00:00, 114.36it/s]
Validating: Epoch 197 | Batch 050 | Loss 0.44140: 100%|██████████| 58/58 [00:00<00:00, 97.48it/s] 
Training: Epoch 198 | Batch 220 | Loss 0.02372: 100%|██████████| 229/229 [00:02<00:00, 110.55it/s]
Validating: Epoch 198 | Batch 050 | Loss 0.21125: 100%|██████████| 58/58 [00:00<00:00, 88.78it/s] 
Training: Epoch 199 | Batch 220 | Loss 0.04676: 100%|██████████| 229/229 [00:02<00:00, 109.39it/s]
Validating: Epoch 199 | Batch 050 | Loss 0.21169: 100%|██████████| 58/58 [00:00<00:00, 88.01it/s] 
Training: Epoch 200 | Batch 220 | Loss 0.03576: 100%|██████████| 229/229 [00:02<00:00, 114.24it/s]
Validating: Epoch 200 | Batch 050 | Loss 0.22380: 100%|██████████| 58/58 [00:00<00:00, 91.53it/s] 
Looping finished
In [ ]:
plot.plot_loss(trainer_mlp.log)
plot.plot_metrics(trainer_mlp.log)
No description has been provided for this image
No description has been provided for this image

We can see that the model converges rather smoothly but there is some overfitting.

Note: I am using a fixed train-validation-test split of the data.

Evaluation¶

In [ ]:
_, model_mlp, _, _ = utils_checkpoints.load(path_dir_exp_mlp / "checkpoints" / "final.pth")
print(torchsummary.summary(model_mlp, [config.MODEL["kwargs"]["num_channels_in"]]))

evaluator_mlp = Evaluator("svhn_mlp", model_mlp)
evaluator_mlp.evaluate()

print(f"Loss on test data: {evaluator_mlp.log["total"]["loss"]}")
print(f"Metrics on test data")
for name, metrics in evaluator_mlp.log["total"]["metrics"].items():
    print(f"    {name:<10}: {metrics}")
==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
├─Sequential: 1-1                        [-1, 10]                  --
|    └─Flatten: 2-1                      [-1, 3072]                --
|    └─Linear: 2-2                       [-1, 1024]                3,146,752
|    └─BatchNorm1d: 2-3                  [-1, 1024]                2,048
|    └─ReLU: 2-4                         [-1, 1024]                --
|    └─Dropout: 2-5                      [-1, 1024]                --
|    └─Linear: 2-6                       [-1, 256]                 262,400
|    └─BatchNorm1d: 2-7                  [-1, 256]                 512
|    └─ReLU: 2-8                         [-1, 256]                 --
|    └─Dropout: 2-9                      [-1, 256]                 --
|    └─Linear: 2-10                      [-1, 64]                  16,448
|    └─BatchNorm1d: 2-11                 [-1, 64]                  128
|    └─ReLU: 2-12                        [-1, 64]                  --
|    └─Dropout: 2-13                     [-1, 64]                  --
|    └─Linear: 2-14                      [-1, 10]                  650
==========================================================================================
Total params: 3,428,938
Trainable params: 3,428,938
Non-trainable params: 0
Total mult-adds (M): 6.85
==========================================================================================
Input size (MB): 0.01
Forward/backward pass size (MB): 0.02
Params size (MB): 13.08
Estimated Total Size (MB): 13.11
==========================================================================================
==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
├─Sequential: 1-1                        [-1, 10]                  --
|    └─Flatten: 2-1                      [-1, 3072]                --
|    └─Linear: 2-2                       [-1, 1024]                3,146,752
|    └─BatchNorm1d: 2-3                  [-1, 1024]                2,048
|    └─ReLU: 2-4                         [-1, 1024]                --
|    └─Dropout: 2-5                      [-1, 1024]                --
|    └─Linear: 2-6                       [-1, 256]                 262,400
|    └─BatchNorm1d: 2-7                  [-1, 256]                 512
|    └─ReLU: 2-8                         [-1, 256]                 --
|    └─Dropout: 2-9                      [-1, 256]                 --
|    └─Linear: 2-10                      [-1, 64]                  16,448
|    └─BatchNorm1d: 2-11                 [-1, 64]                  128
|    └─ReLU: 2-12                        [-1, 64]                  --
|    └─Dropout: 2-13                     [-1, 64]                  --
|    └─Linear: 2-14                      [-1, 10]                  650
==========================================================================================
Total params: 3,428,938
Trainable params: 3,428,938
Non-trainable params: 0
Total mult-adds (M): 6.85
==========================================================================================
Input size (MB): 0.01
Forward/backward pass size (MB): 0.02
Params size (MB): 13.08
Estimated Total Size (MB): 13.11
==========================================================================================
Setting up dataloader...
Test dataset
Dataset SVHN
    Number of datapoints: 26032
    Root location: /home/user/karacora/lab-vision-systems-assignments/assignment_2/data/svhn
    Split: test
    Transform: Compose(
    ToTensor()
)
    Transform of target: Compose(
)
Setting up dataloader finished
Setting up criterion...
Setting up criterion finished
Setting up measurers...
Setting up measurers finished
Validating: Batch 100 | Loss 0.97178: 100%|██████████| 102/102 [00:00<00:00, 108.08it/s]
Loss on test data: 1.0617861057177744
Metrics on test data
    Accuracy  : 0.8063537185003073

On the unseen test data, the model reaches an accuracy of approx $0.81$ which is not bad compared to assignment 1.

Training the CNN¶

In [ ]:
path_dir_exp_cnn = Path(config._PATH_DIR_EXPS) / "svhn_cnn"

init_exp.init_exp(name_exp="svhn_cnn", name_config="svhn_cnn")
config.set_config_exp(path_dir_exp_cnn)
Initializing experiment svhn_cnn...
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn/checkpoints
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn/logs
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn/plots
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn/visualizations
Config loaded from /home/user/karacora/lab-vision-systems-assignments/assignment_2/assignment/configs/svhn_cnn.yaml
Config saved to /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn/config.yaml
Initializing experiment svhn_cnn finished
Config loaded from /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn/config.yaml

Optimization of hyperparameters¶

In [ ]:
optimizer_hyperparams_cnn = OptimizerHyperparams(name_exp="svhn_cnn")
optimizer_hyperparams_cnn.create_study(load_if_exists=False)
optimizer_hyperparams_cnn.optimize(num_epochs=10, num_trials=50)
Creating study...
[I 2024-05-15 19:28:38,535] A new study created in RDB with name: study
Creating study finished
Optimizing...
    Trials    : 50
    Epochs    : 10
[I 2024-05-15 19:29:09,461] Trial 0 finished with value: 0.8514196014196014 and parameters: {'Learning rate': 0.0005226676387107944, 'Hidden dimensions of body': '[128]', 'Hidden dimensions of head': '[128]', 'Convolution kernel shape': '[3, 3]', 'Normalization layer in body': None, 'Activation layer in body': 'ReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'same', 'Padding mode in convolution': 'circular'}. Best is trial 0 with value: 0.8514196014196014.
[I 2024-05-15 19:29:39,605] Trial 1 finished with value: 0.5234097734097735 and parameters: {'Learning rate': 0.0004531904155555682, 'Hidden dimensions of body': '[64, 128, 256]', 'Hidden dimensions of head': '[512]', 'Convolution kernel shape': '[1, 1]', 'Normalization layer in body': None, 'Activation layer in body': 'ReLU', 'Pooling layer in body': 'AvgPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'valid', 'Padding mode in convolution': 'replicate'}. Best is trial 0 with value: 0.8514196014196014.
[I 2024-05-15 19:30:09,837] Trial 2 finished with value: 0.6215533715533715 and parameters: {'Learning rate': 0.0006281013131643014, 'Hidden dimensions of body': '[]', 'Hidden dimensions of head': '[128, 64]', 'Convolution kernel shape': '[5, 5]', 'Normalization layer in body': 'BatchNorm2d', 'Activation layer in body': 'Softplus', 'Pooling layer in body': 'AvgPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'same', 'Padding mode in convolution': 'replicate'}. Best is trial 0 with value: 0.8514196014196014.
[I 2024-05-15 19:30:40,009] Trial 3 finished with value: 0.19123669123669124 and parameters: {'Learning rate': 0.0006540567313064919, 'Hidden dimensions of body': '[16]', 'Hidden dimensions of head': '[128, 64]', 'Convolution kernel shape': '[1, 1]', 'Normalization layer in body': None, 'Activation layer in body': 'Softplus', 'Pooling layer in body': 'AvgPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'same', 'Padding mode in convolution': 'replicate'}. Best is trial 0 with value: 0.8514196014196014.
[I 2024-05-15 19:31:10,736] Trial 4 finished with value: 0.8547638547638547 and parameters: {'Learning rate': 9.049972093963853e-05, 'Hidden dimensions of body': '[16]', 'Hidden dimensions of head': '[256]', 'Convolution kernel shape': '[7, 7]', 'Normalization layer in body': None, 'Activation layer in body': 'ReLU', 'Pooling layer in body': 'AvgPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'valid', 'Padding mode in convolution': 'circular'}. Best is trial 4 with value: 0.8547638547638547.
[I 2024-05-15 19:31:40,884] Trial 5 finished with value: 0.7954545454545454 and parameters: {'Learning rate': 0.0007448601642645966, 'Hidden dimensions of body': '[16]', 'Hidden dimensions of head': '[512]', 'Convolution kernel shape': '[3, 3]', 'Normalization layer in body': None, 'Activation layer in body': 'ReLU', 'Pooling layer in body': 'AvgPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'same', 'Padding mode in convolution': 'replicate'}. Best is trial 4 with value: 0.8547638547638547.
[I 2024-05-15 19:32:11,162] Trial 6 finished with value: 0.8925061425061425 and parameters: {'Learning rate': 0.00026437557797658404, 'Hidden dimensions of body': '[64]', 'Hidden dimensions of head': '[256]', 'Convolution kernel shape': '[5, 5]', 'Normalization layer in body': 'BatchNorm2d', 'Activation layer in body': 'ReLU', 'Pooling layer in body': 'AvgPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'same', 'Padding mode in convolution': 'replicate'}. Best is trial 6 with value: 0.8925061425061425.
[I 2024-05-15 19:32:41,147] Trial 7 finished with value: 0.5214305214305214 and parameters: {'Learning rate': 2.994618518515035e-05, 'Hidden dimensions of body': '[]', 'Hidden dimensions of head': '[128]', 'Convolution kernel shape': '[7, 7]', 'Normalization layer in body': 'BatchNorm2d', 'Activation layer in body': 'Softplus', 'Pooling layer in body': 'AvgPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'valid', 'Padding mode in convolution': 'reflect'}. Best is trial 6 with value: 0.8925061425061425.
[I 2024-05-15 19:33:11,370] Trial 8 finished with value: 0.8978296478296478 and parameters: {'Learning rate': 0.0003871080924875663, 'Hidden dimensions of body': '[16]', 'Hidden dimensions of head': '[128, 64]', 'Convolution kernel shape': '[5, 5]', 'Normalization layer in body': 'BatchNorm2d', 'Activation layer in body': 'ReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'same', 'Padding mode in convolution': 'circular'}. Best is trial 8 with value: 0.8978296478296478.
[I 2024-05-15 19:33:41,078] Trial 9 finished with value: 0.6554054054054054 and parameters: {'Learning rate': 0.0002584465850671808, 'Hidden dimensions of body': '[128]', 'Hidden dimensions of head': '[64]', 'Convolution kernel shape': '[1, 1]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'LeakyReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'same', 'Padding mode in convolution': 'reflect'}. Best is trial 8 with value: 0.8978296478296478.
[I 2024-05-15 19:33:44,466] Trial 10 pruned. 
[I 2024-05-15 19:34:14,262] Trial 11 finished with value: 0.8787878787878788 and parameters: {'Learning rate': 0.00031331199393248773, 'Hidden dimensions of body': '[64]', 'Hidden dimensions of head': '[]', 'Convolution kernel shape': '[5, 5]', 'Normalization layer in body': 'BatchNorm2d', 'Activation layer in body': 'ReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'same', 'Padding mode in convolution': 'circular'}. Best is trial 8 with value: 0.8978296478296478.
[I 2024-05-15 19:34:44,533] Trial 12 finished with value: 0.9002184002184003 and parameters: {'Learning rate': 0.0002565533374078692, 'Hidden dimensions of body': '[64]', 'Hidden dimensions of head': '[256, 128]', 'Convolution kernel shape': '[5, 5]', 'Normalization layer in body': 'BatchNorm2d', 'Activation layer in body': 'ReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'same', 'Padding mode in convolution': 'zeros'}. Best is trial 12 with value: 0.9002184002184003.
/home/user/karacora/lab-vision-systems-assignments/.venv/lib/python3.12/site-packages/torch/nn/modules/conv.py:456: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)
  return F.conv2d(input, weight, bias, self.stride,
[I 2024-05-15 19:35:14,699] Trial 13 finished with value: 0.95004095004095 and parameters: {'Learning rate': 0.00040144388016420196, 'Hidden dimensions of body': '[32, 64, 128]', 'Hidden dimensions of head': '[256, 128]', 'Convolution kernel shape': '[5, 5]', 'Normalization layer in body': 'BatchNorm2d', 'Activation layer in body': 'ReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'same', 'Padding mode in convolution': 'zeros'}. Best is trial 13 with value: 0.95004095004095.
[I 2024-05-15 19:35:45,060] Trial 14 finished with value: 0.9523614523614523 and parameters: {'Learning rate': 0.00017541075115537567, 'Hidden dimensions of body': '[32, 64, 128]', 'Hidden dimensions of head': '[256, 128]', 'Convolution kernel shape': '[5, 5]', 'Normalization layer in body': 'BatchNorm2d', 'Activation layer in body': 'ReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'same', 'Padding mode in convolution': 'zeros'}. Best is trial 14 with value: 0.9523614523614523.
[I 2024-05-15 19:36:15,213] Trial 15 finished with value: 0.9441031941031941 and parameters: {'Learning rate': 0.0001524527187797489, 'Hidden dimensions of body': '[32, 64, 128]', 'Hidden dimensions of head': '[256, 128]', 'Convolution kernel shape': '[5, 5]', 'Normalization layer in body': 'BatchNorm2d', 'Activation layer in body': 'LeakyReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'same', 'Padding mode in convolution': 'zeros'}. Best is trial 14 with value: 0.9523614523614523.
[I 2024-05-15 19:36:44,986] Trial 16 finished with value: 0.9530439530439531 and parameters: {'Learning rate': 0.0001430750251592835, 'Hidden dimensions of body': '[32, 64, 128]', 'Hidden dimensions of head': '[256, 128]', 'Convolution kernel shape': '[5, 5]', 'Normalization layer in body': 'BatchNorm2d', 'Activation layer in body': 'ReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'same', 'Padding mode in convolution': 'zeros'}. Best is trial 16 with value: 0.9530439530439531.
[I 2024-05-15 19:37:18,550] Trial 17 finished with value: 0.9726317226317226 and parameters: {'Learning rate': 0.00015709316905779404, 'Hidden dimensions of body': '[32, 64, 128]', 'Hidden dimensions of head': '[256, 128]', 'Convolution kernel shape': '[7, 7]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'ReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'same', 'Padding mode in convolution': 'zeros'}. Best is trial 17 with value: 0.9726317226317226.
[I 2024-05-15 19:37:48,959] Trial 18 finished with value: 0.8906633906633906 and parameters: {'Learning rate': 3.8280259033015855e-05, 'Hidden dimensions of body': '[16, 32]', 'Hidden dimensions of head': '[256, 128]', 'Convolution kernel shape': '[7, 7]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'Softplus', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'valid', 'Padding mode in convolution': 'zeros'}. Best is trial 17 with value: 0.9726317226317226.
[I 2024-05-15 19:38:18,580] Trial 19 finished with value: 0.9287469287469288 and parameters: {'Learning rate': 0.00012724073077134218, 'Hidden dimensions of body': '[32, 64]', 'Hidden dimensions of head': '[64]', 'Convolution kernel shape': '[7, 7]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'LeakyReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'same', 'Padding mode in convolution': 'zeros'}. Best is trial 17 with value: 0.9726317226317226.
[I 2024-05-15 19:38:51,580] Trial 20 finished with value: 0.9686049686049686 and parameters: {'Learning rate': 0.0009746534900259671, 'Hidden dimensions of body': '[32, 64, 128]', 'Hidden dimensions of head': '[]', 'Convolution kernel shape': '[7, 7]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'ReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'same', 'Padding mode in convolution': 'zeros'}. Best is trial 17 with value: 0.9726317226317226.
[I 2024-05-15 19:39:24,410] Trial 21 finished with value: 0.9652607152607152 and parameters: {'Learning rate': 0.0009612244869360221, 'Hidden dimensions of body': '[32, 64, 128]', 'Hidden dimensions of head': '[]', 'Convolution kernel shape': '[7, 7]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'ReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'same', 'Padding mode in convolution': 'zeros'}. Best is trial 17 with value: 0.9726317226317226.
[I 2024-05-15 19:39:57,532] Trial 22 finished with value: 0.9694922194922195 and parameters: {'Learning rate': 0.0009979808397057603, 'Hidden dimensions of body': '[32, 64, 128]', 'Hidden dimensions of head': '[]', 'Convolution kernel shape': '[7, 7]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'ReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'same', 'Padding mode in convolution': 'zeros'}. Best is trial 17 with value: 0.9726317226317226.
[I 2024-05-15 19:40:30,464] Trial 23 finished with value: 0.9660797160797161 and parameters: {'Learning rate': 0.0008742326815205995, 'Hidden dimensions of body': '[32, 64, 128]', 'Hidden dimensions of head': '[]', 'Convolution kernel shape': '[7, 7]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'ReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'same', 'Padding mode in convolution': 'zeros'}. Best is trial 17 with value: 0.9726317226317226.
/home/user/karacora/lab-vision-systems-assignments/.venv/lib/python3.12/site-packages/torch/nn/modules/conv.py:453: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)
  return F.conv2d(F.pad(input, self._reversed_padding_repeated_twice, mode=self.padding_mode),
[I 2024-05-15 19:41:07,158] Trial 24 finished with value: 0.9804122304122305 and parameters: {'Learning rate': 0.0008464003766435961, 'Hidden dimensions of body': '[32, 64, 128]', 'Hidden dimensions of head': '[]', 'Convolution kernel shape': '[7, 7]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'ReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'same', 'Padding mode in convolution': 'reflect'}. Best is trial 24 with value: 0.9804122304122305.
[I 2024-05-15 19:41:37,042] Trial 25 finished with value: 0.9394621894621895 and parameters: {'Learning rate': 0.0008108937957324257, 'Hidden dimensions of body': '[32, 64]', 'Hidden dimensions of head': '[]', 'Convolution kernel shape': '[7, 7]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'ReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'same', 'Padding mode in convolution': 'reflect'}. Best is trial 24 with value: 0.9804122304122305.
[I 2024-05-15 19:41:40,531] Trial 26 pruned. 
[I 2024-05-15 19:42:37,211] Trial 27 finished with value: 0.984984984984985 and parameters: {'Learning rate': 0.0007355658045885233, 'Hidden dimensions of body': '[64, 128, 256]', 'Hidden dimensions of head': '[]', 'Convolution kernel shape': '[7, 7]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'ReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'same', 'Padding mode in convolution': 'reflect'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:43:14,324] Trial 28 finished with value: 0.9593229593229593 and parameters: {'Learning rate': 0.0006709766859437004, 'Hidden dimensions of body': '[64, 128, 256]', 'Hidden dimensions of head': '[]', 'Convolution kernel shape': '[3, 3]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'Softplus', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'same', 'Padding mode in convolution': 'reflect'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:43:50,511] Trial 29 finished with value: 0.975975975975976 and parameters: {'Learning rate': 0.0005593102027037774, 'Hidden dimensions of body': '[64, 128, 256]', 'Hidden dimensions of head': '[128]', 'Convolution kernel shape': '[3, 3]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'LeakyReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'same', 'Padding mode in convolution': 'reflect'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:44:26,689] Trial 30 finished with value: 0.9757712257712258 and parameters: {'Learning rate': 0.0005569442034104378, 'Hidden dimensions of body': '[64, 128, 256]', 'Hidden dimensions of head': '[128]', 'Convolution kernel shape': '[3, 3]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'LeakyReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'same', 'Padding mode in convolution': 'reflect'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:45:02,928] Trial 31 finished with value: 0.9790472290472291 and parameters: {'Learning rate': 0.0005386129229970634, 'Hidden dimensions of body': '[64, 128, 256]', 'Hidden dimensions of head': '[128]', 'Convolution kernel shape': '[3, 3]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'LeakyReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'same', 'Padding mode in convolution': 'reflect'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:45:39,014] Trial 32 finished with value: 0.9781599781599781 and parameters: {'Learning rate': 0.0005523277020452859, 'Hidden dimensions of body': '[64, 128, 256]', 'Hidden dimensions of head': '[128]', 'Convolution kernel shape': '[3, 3]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'LeakyReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'same', 'Padding mode in convolution': 'reflect'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:46:15,224] Trial 33 finished with value: 0.9765902265902265 and parameters: {'Learning rate': 0.0007561077853370539, 'Hidden dimensions of body': '[64, 128, 256]', 'Hidden dimensions of head': '[128]', 'Convolution kernel shape': '[3, 3]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'LeakyReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'same', 'Padding mode in convolution': 'reflect'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:46:51,384] Trial 34 finished with value: 0.9771362271362272 and parameters: {'Learning rate': 0.0004823452838460006, 'Hidden dimensions of body': '[64, 128, 256]', 'Hidden dimensions of head': '[128]', 'Convolution kernel shape': '[3, 3]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'LeakyReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'same', 'Padding mode in convolution': 'reflect'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:47:21,308] Trial 35 finished with value: 0.8579033579033579 and parameters: {'Learning rate': 0.0006053320515494303, 'Hidden dimensions of body': '[64, 128, 256]', 'Hidden dimensions of head': '[128]', 'Convolution kernel shape': '[3, 3]', 'Normalization layer in body': None, 'Activation layer in body': 'LeakyReLU', 'Pooling layer in body': 'AvgPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'valid', 'Padding mode in convolution': 'reflect'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:47:52,250] Trial 36 finished with value: 0.6233961233961234 and parameters: {'Learning rate': 0.0007171502535335611, 'Hidden dimensions of body': '[64, 128, 256]', 'Hidden dimensions of head': '[512]', 'Convolution kernel shape': '[1, 1]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'LeakyReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'same', 'Padding mode in convolution': 'reflect'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:48:22,495] Trial 37 finished with value: 0.8566066066066066 and parameters: {'Learning rate': 0.0008229998810430904, 'Hidden dimensions of body': '[16, 32]', 'Hidden dimensions of head': '[256]', 'Convolution kernel shape': '[3, 3]', 'Normalization layer in body': None, 'Activation layer in body': 'LeakyReLU', 'Pooling layer in body': 'AvgPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'same', 'Padding mode in convolution': 'reflect'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:48:52,344] Trial 38 finished with value: 0.43297843297843297 and parameters: {'Learning rate': 0.0006844986839819062, 'Hidden dimensions of body': '[128]', 'Hidden dimensions of head': '[128]', 'Convolution kernel shape': '[3, 3]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'Softplus', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'valid', 'Padding mode in convolution': 'reflect'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:49:22,220] Trial 39 finished with value: 0.6349303849303849 and parameters: {'Learning rate': 0.0006036691750302404, 'Hidden dimensions of body': '[]', 'Hidden dimensions of head': '[128, 64]', 'Convolution kernel shape': '[1, 1]', 'Normalization layer in body': None, 'Activation layer in body': 'LeakyReLU', 'Pooling layer in body': 'AvgPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'same', 'Padding mode in convolution': 'replicate'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:49:58,858] Trial 40 finished with value: 0.9778187278187278 and parameters: {'Learning rate': 0.00043553502436264864, 'Hidden dimensions of body': '[64, 128, 256]', 'Hidden dimensions of head': '[512]', 'Convolution kernel shape': '[3, 3]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'LeakyReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'same', 'Padding mode in convolution': 'reflect'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:50:35,514] Trial 41 finished with value: 0.9776139776139776 and parameters: {'Learning rate': 0.0005147252552661301, 'Hidden dimensions of body': '[64, 128, 256]', 'Hidden dimensions of head': '[512]', 'Convolution kernel shape': '[3, 3]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'LeakyReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'same', 'Padding mode in convolution': 'reflect'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:51:12,150] Trial 42 finished with value: 0.9763854763854763 and parameters: {'Learning rate': 0.00045558688849476166, 'Hidden dimensions of body': '[64, 128, 256]', 'Hidden dimensions of head': '[512]', 'Convolution kernel shape': '[3, 3]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'LeakyReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'same', 'Padding mode in convolution': 'reflect'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:51:52,733] Trial 43 finished with value: 0.976044226044226 and parameters: {'Learning rate': 0.0004005504219800072, 'Hidden dimensions of body': '[64, 128, 256]', 'Hidden dimensions of head': '[512]', 'Convolution kernel shape': '[3, 3]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'LeakyReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'same', 'Padding mode in convolution': 'circular'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:52:29,817] Trial 44 finished with value: 0.9713349713349714 and parameters: {'Learning rate': 0.00035925903536287757, 'Hidden dimensions of body': '[64, 128, 256]', 'Hidden dimensions of head': '[64]', 'Convolution kernel shape': '[3, 3]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'LeakyReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'same', 'Padding mode in convolution': 'reflect'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:53:00,371] Trial 45 finished with value: 0.8037128037128037 and parameters: {'Learning rate': 0.0005656734984171915, 'Hidden dimensions of body': '[16]', 'Hidden dimensions of head': '[64, 32]', 'Convolution kernel shape': '[3, 3]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'Softplus', 'Pooling layer in body': 'AvgPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'same', 'Padding mode in convolution': 'reflect'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:53:30,912] Trial 46 finished with value: 0.5238192738192738 and parameters: {'Learning rate': 0.000912952300058452, 'Hidden dimensions of body': '[]', 'Hidden dimensions of head': '[128]', 'Convolution kernel shape': '[1, 1]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'LeakyReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'same', 'Padding mode in convolution': 'replicate'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:53:34,448] Trial 47 pruned. 
[I 2024-05-15 19:54:05,632] Trial 48 finished with value: 0.8641141141141141 and parameters: {'Learning rate': 0.0004761202935343087, 'Hidden dimensions of body': '[128]', 'Hidden dimensions of head': '[256]', 'Convolution kernel shape': '[3, 3]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'ReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'same', 'Padding mode in convolution': 'circular'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:55:03,756] Trial 49 finished with value: 0.9754299754299754 and parameters: {'Learning rate': 0.0006431269292488955, 'Hidden dimensions of body': '[64, 128, 256]', 'Hidden dimensions of head': '[128, 64]', 'Convolution kernel shape': '[7, 7]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'LeakyReLU', 'Pooling layer in body': 'AvgPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'same', 'Padding mode in convolution': 'reflect'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:55:33,772] Trial 50 finished with value: 0.9117526617526618 and parameters: {'Learning rate': 0.000334345620891303, 'Hidden dimensions of body': '[64]', 'Hidden dimensions of head': '[512]', 'Convolution kernel shape': '[3, 3]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'ReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'same', 'Padding mode in convolution': 'reflect'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:56:10,316] Trial 51 finished with value: 0.9722222222222222 and parameters: {'Learning rate': 0.0005378318278590291, 'Hidden dimensions of body': '[64, 128, 256]', 'Hidden dimensions of head': '[512]', 'Convolution kernel shape': '[3, 3]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'LeakyReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'same', 'Padding mode in convolution': 'reflect'}. Best is trial 27 with value: 0.984984984984985.
[I 2024-05-15 19:56:46,945] Trial 52 finished with value: 0.9770679770679771 and parameters: {'Learning rate': 0.00045354795504501025, 'Hidden dimensions of body': '[64, 128, 256]', 'Hidden dimensions of head': '[512]', 'Convolution kernel shape': '[3, 3]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'LeakyReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': False, 'Padding in convolution': 'same', 'Padding mode in convolution': 'reflect'}. Best is trial 27 with value: 0.984984984984985.
Study results
    Trials finished: 53
    Trials completed: 50
Best trial
    Number    : 27
    Value     : 0.984984984984985
    Params    : {'Learning rate': 0.0007355658045885233, 'Hidden dimensions of body': '[64, 128, 256]', 'Hidden dimensions of head': '[]', 'Convolution kernel shape': '[7, 7]', 'Normalization layer in body': 'InstanceNorm2d', 'Activation layer in body': 'ReLU', 'Pooling layer in body': 'MaxPool2d', 'Usage of bias in convolution': True, 'Padding in convolution': 'same', 'Padding mode in convolution': 'reflect'}
Optimizing finished
In [ ]:
study_cnn = utils_optuna.load_study(path_db=path_dir_exp_cnn / "optuna.db")
name_target_cnn = config.OPTIMIZATION_HYPERPARAMS["metric"]

print(f"Best {name_target_cnn}: {study_cnn.best_value}")
print(f"Best parameters")
for param, value in study_cnn.best_params.items():
    print(f"    {param:<30}: {value}")
Best Accuracy: 0.984984984984985
Best parameters
    Learning rate                 : 0.0007355658045885233
    Hidden dimensions of body     : [64, 128, 256]
    Hidden dimensions of head     : []
    Convolution kernel shape      : [7, 7]
    Normalization layer in body   : InstanceNorm2d
    Activation layer in body      : ReLU
    Pooling layer in body         : MaxPool2d
    Usage of bias in convolution  : True
    Padding in convolution        : same
    Padding mode in convolution   : reflect
In [ ]:
optuna.visualization.plot_slice(study_cnn, target_name=name_target_cnn).show()
optuna.visualization.plot_param_importances(study_cnn, target_name=name_target_cnn).show()

Again, I made some slight changes like reducing the hidden dimension sizes to $[32, 64, 128]$ which is almost as good as a much larger model. I set the convolutional kernel size to $[5, 5]$. The corresponding experiment config file has been updated accordingly.

Training¶

In [ ]:
trainer_cnn = Trainer("svhn_cnn")
trainer_cnn.loop(config.TRAINING["num_epochs"])
Setting up dataloaders...
Train dataset
Dataset SVHN
    Number of datapoints: 58605
    Root location: /home/user/karacora/lab-vision-systems-assignments/assignment_2/data/svhn
    Split: train
    Transform: Compose(
    ToTensor()
)
    Transform of target: Compose(
)
Validate dataset
Dataset SVHN
    Number of datapoints: 14652
    Root location: /home/user/karacora/lab-vision-systems-assignments/assignment_2/data/svhn
    Split: validate
    Transform: Compose(
    ToTensor()
)
    Transform of target: Compose(
)
Setting up dataloaders finished
Setting up model...
Model
CNN2d(
  (body): Sequential(
    (0): BlockCNN2d(
      (0): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1), padding=same, padding_mode=reflect)
      (1): InstanceNorm2d(32, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      (2): ReLU()
      (3): MaxPool2d(kernel_size=[2, 2], stride=[2, 2], padding=0, dilation=1, ceil_mode=False)
    )
    (1): BlockCNN2d(
      (0): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1), padding=same, padding_mode=reflect)
      (1): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      (2): ReLU()
      (3): MaxPool2d(kernel_size=[2, 2], stride=[2, 2], padding=0, dilation=1, ceil_mode=False)
    )
    (2): BlockCNN2d(
      (0): Conv2d(64, 128, kernel_size=(5, 5), stride=(1, 1), padding=same, padding_mode=reflect)
      (1): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      (2): ReLU()
      (3): MaxPool2d(kernel_size=[2, 2], stride=[2, 2], padding=0, dilation=1, ceil_mode=False)
    )
  )
  (head): MLP(
    (head): Sequential(
      (0): Flatten(start_dim=1, end_dim=-1)
      (1): Linear(in_features=2048, out_features=10, bias=True)
    )
  )
)
Setting up model finished
Setting up optimizer...
Setting up optimizer finished
Setting up criterion...
Setting up criterion finished
Setting up measurers...
Setting up measurers finished
Looping...
Validating: Epoch 000 | Batch 050 | Loss 2.47663: 100%|██████████| 58/58 [00:00<00:00, 89.43it/s] 
Training: Epoch 001 | Batch 220 | Loss 0.53636: 100%|██████████| 229/229 [00:02<00:00, 111.04it/s]
Validating: Epoch 001 | Batch 050 | Loss 0.53411: 100%|██████████| 58/58 [00:00<00:00, 86.51it/s] 
Training: Epoch 002 | Batch 220 | Loss 0.40333: 100%|██████████| 229/229 [00:02<00:00, 109.03it/s]
Validating: Epoch 002 | Batch 050 | Loss 0.39416: 100%|██████████| 58/58 [00:00<00:00, 86.59it/s] 
Training: Epoch 003 | Batch 220 | Loss 0.29951: 100%|██████████| 229/229 [00:02<00:00, 108.46it/s]
Validating: Epoch 003 | Batch 050 | Loss 0.35821: 100%|██████████| 58/58 [00:00<00:00, 97.97it/s] 
Training: Epoch 004 | Batch 220 | Loss 0.27483: 100%|██████████| 229/229 [00:02<00:00, 103.50it/s]
Validating: Epoch 004 | Batch 050 | Loss 0.30979: 100%|██████████| 58/58 [00:00<00:00, 89.97it/s] 
Training: Epoch 005 | Batch 220 | Loss 0.28334: 100%|██████████| 229/229 [00:02<00:00, 107.59it/s]
Validating: Epoch 005 | Batch 050 | Loss 0.29450: 100%|██████████| 58/58 [00:00<00:00, 91.26it/s] 
Training: Epoch 006 | Batch 220 | Loss 0.28184: 100%|██████████| 229/229 [00:02<00:00, 102.00it/s]
Validating: Epoch 006 | Batch 050 | Loss 0.27201: 100%|██████████| 58/58 [00:00<00:00, 90.99it/s] 
Training: Epoch 007 | Batch 220 | Loss 0.23532: 100%|██████████| 229/229 [00:02<00:00, 108.62it/s]
Validating: Epoch 007 | Batch 050 | Loss 0.23032: 100%|██████████| 58/58 [00:00<00:00, 85.86it/s]
Training: Epoch 008 | Batch 220 | Loss 0.17018: 100%|██████████| 229/229 [00:02<00:00, 108.84it/s]
Validating: Epoch 008 | Batch 050 | Loss 0.22269: 100%|██████████| 58/58 [00:00<00:00, 87.32it/s] 
Training: Epoch 009 | Batch 220 | Loss 0.12288: 100%|██████████| 229/229 [00:02<00:00, 110.11it/s]
Validating: Epoch 009 | Batch 050 | Loss 0.20386: 100%|██████████| 58/58 [00:00<00:00, 91.08it/s] 
Training: Epoch 010 | Batch 220 | Loss 0.15850: 100%|██████████| 229/229 [00:02<00:00, 110.13it/s]
Validating: Epoch 010 | Batch 050 | Loss 0.20192: 100%|██████████| 58/58 [00:00<00:00, 93.18it/s] 
Training: Epoch 011 | Batch 220 | Loss 0.11558: 100%|██████████| 229/229 [00:02<00:00, 108.42it/s]
Validating: Epoch 011 | Batch 050 | Loss 0.17454: 100%|██████████| 58/58 [00:00<00:00, 91.39it/s] 
Training: Epoch 012 | Batch 220 | Loss 0.20818: 100%|██████████| 229/229 [00:02<00:00, 109.06it/s]
Validating: Epoch 012 | Batch 050 | Loss 0.15967: 100%|██████████| 58/58 [00:00<00:00, 81.64it/s]
Training: Epoch 013 | Batch 220 | Loss 0.16830: 100%|██████████| 229/229 [00:02<00:00, 108.28it/s]
Validating: Epoch 013 | Batch 050 | Loss 0.15982: 100%|██████████| 58/58 [00:00<00:00, 85.55it/s]
Training: Epoch 014 | Batch 220 | Loss 0.07806: 100%|██████████| 229/229 [00:02<00:00, 108.02it/s]
Validating: Epoch 014 | Batch 050 | Loss 0.14530: 100%|██████████| 58/58 [00:00<00:00, 90.33it/s] 
Training: Epoch 015 | Batch 220 | Loss 0.16797: 100%|██████████| 229/229 [00:02<00:00, 105.55it/s]
Validating: Epoch 015 | Batch 050 | Loss 0.11693: 100%|██████████| 58/58 [00:00<00:00, 85.69it/s] 
Training: Epoch 016 | Batch 220 | Loss 0.07693: 100%|██████████| 229/229 [00:02<00:00, 108.19it/s]
Validating: Epoch 016 | Batch 050 | Loss 0.12011: 100%|██████████| 58/58 [00:00<00:00, 94.52it/s] 
Training: Epoch 017 | Batch 220 | Loss 0.11185: 100%|██████████| 229/229 [00:02<00:00, 108.97it/s]
Validating: Epoch 017 | Batch 050 | Loss 0.09908: 100%|██████████| 58/58 [00:00<00:00, 87.80it/s] 
Training: Epoch 018 | Batch 220 | Loss 0.08770: 100%|██████████| 229/229 [00:02<00:00, 108.71it/s]
Validating: Epoch 018 | Batch 050 | Loss 0.09566: 100%|██████████| 58/58 [00:00<00:00, 88.07it/s] 
Training: Epoch 019 | Batch 220 | Loss 0.04650: 100%|██████████| 229/229 [00:02<00:00, 108.24it/s]
Validating: Epoch 019 | Batch 050 | Loss 0.08606: 100%|██████████| 58/58 [00:00<00:00, 93.44it/s] 
Training: Epoch 020 | Batch 220 | Loss 0.04938: 100%|██████████| 229/229 [00:02<00:00, 108.41it/s]
Validating: Epoch 020 | Batch 050 | Loss 0.07826: 100%|██████████| 58/58 [00:00<00:00, 95.50it/s] 
Training: Epoch 021 | Batch 220 | Loss 0.03820: 100%|██████████| 229/229 [00:02<00:00, 106.52it/s]
Validating: Epoch 021 | Batch 050 | Loss 0.07866: 100%|██████████| 58/58 [00:00<00:00, 83.90it/s] 
Training: Epoch 022 | Batch 220 | Loss 0.05641: 100%|██████████| 229/229 [00:02<00:00, 106.38it/s]
Validating: Epoch 022 | Batch 050 | Loss 0.06992: 100%|██████████| 58/58 [00:00<00:00, 93.62it/s] 
Training: Epoch 023 | Batch 220 | Loss 0.07780: 100%|██████████| 229/229 [00:02<00:00, 108.82it/s]
Validating: Epoch 023 | Batch 050 | Loss 0.07496: 100%|██████████| 58/58 [00:00<00:00, 84.25it/s]
Training: Epoch 024 | Batch 220 | Loss 0.02588: 100%|██████████| 229/229 [00:02<00:00, 106.96it/s]
Validating: Epoch 024 | Batch 050 | Loss 0.06287: 100%|██████████| 58/58 [00:00<00:00, 84.94it/s]
Training: Epoch 025 | Batch 220 | Loss 0.02369: 100%|██████████| 229/229 [00:02<00:00, 106.34it/s]
Validating: Epoch 025 | Batch 050 | Loss 0.06064: 100%|██████████| 58/58 [00:00<00:00, 91.52it/s] 
Training: Epoch 026 | Batch 220 | Loss 0.02436: 100%|██████████| 229/229 [00:02<00:00, 105.12it/s]
Validating: Epoch 026 | Batch 050 | Loss 0.06041: 100%|██████████| 58/58 [00:00<00:00, 93.34it/s] 
Training: Epoch 027 | Batch 220 | Loss 0.02162: 100%|██████████| 229/229 [00:02<00:00, 105.61it/s]
Validating: Epoch 027 | Batch 050 | Loss 0.06823: 100%|██████████| 58/58 [00:00<00:00, 83.09it/s]
Training: Epoch 028 | Batch 220 | Loss 0.01593: 100%|██████████| 229/229 [00:02<00:00, 107.60it/s]
Validating: Epoch 028 | Batch 050 | Loss 0.06039: 100%|██████████| 58/58 [00:00<00:00, 79.58it/s] 
Training: Epoch 029 | Batch 220 | Loss 0.01252: 100%|██████████| 229/229 [00:02<00:00, 107.54it/s]
Validating: Epoch 029 | Batch 050 | Loss 0.06215: 100%|██████████| 58/58 [00:00<00:00, 85.73it/s] 
Training: Epoch 030 | Batch 220 | Loss 0.01623: 100%|██████████| 229/229 [00:02<00:00, 107.01it/s]
Validating: Epoch 030 | Batch 050 | Loss 0.06640: 100%|██████████| 58/58 [00:00<00:00, 88.16it/s] 
Training: Epoch 031 | Batch 220 | Loss 0.01324: 100%|██████████| 229/229 [00:02<00:00, 106.02it/s]
Validating: Epoch 031 | Batch 050 | Loss 0.05819: 100%|██████████| 58/58 [00:00<00:00, 87.39it/s] 
Training: Epoch 032 | Batch 220 | Loss 0.00883: 100%|██████████| 229/229 [00:02<00:00, 107.69it/s]
Validating: Epoch 032 | Batch 050 | Loss 0.05541: 100%|██████████| 58/58 [00:00<00:00, 95.59it/s] 
Training: Epoch 033 | Batch 220 | Loss 0.00902: 100%|██████████| 229/229 [00:02<00:00, 108.76it/s]
Validating: Epoch 033 | Batch 050 | Loss 0.05581: 100%|██████████| 58/58 [00:00<00:00, 88.83it/s] 
Training: Epoch 034 | Batch 220 | Loss 0.01072: 100%|██████████| 229/229 [00:02<00:00, 107.65it/s]
Validating: Epoch 034 | Batch 050 | Loss 0.06075: 100%|██████████| 58/58 [00:00<00:00, 92.42it/s] 
Training: Epoch 035 | Batch 220 | Loss 0.00773: 100%|██████████| 229/229 [00:02<00:00, 106.50it/s]
Validating: Epoch 035 | Batch 050 | Loss 0.06542: 100%|██████████| 58/58 [00:00<00:00, 90.29it/s] 
Training: Epoch 036 | Batch 220 | Loss 0.00861: 100%|██████████| 229/229 [00:02<00:00, 108.11it/s]
Validating: Epoch 036 | Batch 050 | Loss 0.06170: 100%|██████████| 58/58 [00:00<00:00, 82.79it/s]
Training: Epoch 037 | Batch 220 | Loss 0.00477: 100%|██████████| 229/229 [00:02<00:00, 108.92it/s]
Validating: Epoch 037 | Batch 050 | Loss 0.06213: 100%|██████████| 58/58 [00:00<00:00, 92.48it/s] 
Training: Epoch 038 | Batch 220 | Loss 0.00402: 100%|██████████| 229/229 [00:02<00:00, 107.99it/s]
Validating: Epoch 038 | Batch 050 | Loss 0.05861: 100%|██████████| 58/58 [00:00<00:00, 88.02it/s] 
Training: Epoch 039 | Batch 220 | Loss 0.00514: 100%|██████████| 229/229 [00:02<00:00, 107.04it/s]
Validating: Epoch 039 | Batch 050 | Loss 0.06260: 100%|██████████| 58/58 [00:00<00:00, 88.46it/s] 
Training: Epoch 040 | Batch 220 | Loss 0.00453: 100%|██████████| 229/229 [00:02<00:00, 107.51it/s]
Validating: Epoch 040 | Batch 050 | Loss 0.06156: 100%|██████████| 58/58 [00:00<00:00, 82.97it/s]
Training: Epoch 041 | Batch 220 | Loss 0.00507: 100%|██████████| 229/229 [00:02<00:00, 106.90it/s]
Validating: Epoch 041 | Batch 050 | Loss 0.07695: 100%|██████████| 58/58 [00:00<00:00, 88.80it/s] 
Training: Epoch 042 | Batch 220 | Loss 0.02265: 100%|██████████| 229/229 [00:02<00:00, 108.29it/s]
Validating: Epoch 042 | Batch 050 | Loss 0.07242: 100%|██████████| 58/58 [00:00<00:00, 89.90it/s] 
Training: Epoch 043 | Batch 220 | Loss 0.00512: 100%|██████████| 229/229 [00:02<00:00, 108.94it/s]
Validating: Epoch 043 | Batch 050 | Loss 0.05263: 100%|██████████| 58/58 [00:00<00:00, 94.38it/s] 
Training: Epoch 044 | Batch 220 | Loss 0.00229: 100%|██████████| 229/229 [00:02<00:00, 106.22it/s]
Validating: Epoch 044 | Batch 050 | Loss 0.05820: 100%|██████████| 58/58 [00:00<00:00, 83.52it/s]
Training: Epoch 045 | Batch 220 | Loss 0.00244: 100%|██████████| 229/229 [00:02<00:00, 103.83it/s]
Validating: Epoch 045 | Batch 050 | Loss 0.05675: 100%|██████████| 58/58 [00:00<00:00, 89.91it/s] 
Training: Epoch 046 | Batch 220 | Loss 0.00205: 100%|██████████| 229/229 [00:02<00:00, 106.80it/s]
Validating: Epoch 046 | Batch 050 | Loss 0.05480: 100%|██████████| 58/58 [00:00<00:00, 90.15it/s] 
Training: Epoch 047 | Batch 220 | Loss 0.00227: 100%|██████████| 229/229 [00:02<00:00, 106.91it/s]
Validating: Epoch 047 | Batch 050 | Loss 0.05601: 100%|██████████| 58/58 [00:00<00:00, 86.34it/s] 
Training: Epoch 048 | Batch 220 | Loss 0.00230: 100%|██████████| 229/229 [00:02<00:00, 108.70it/s]
Validating: Epoch 048 | Batch 050 | Loss 0.05762: 100%|██████████| 58/58 [00:00<00:00, 86.65it/s] 
Training: Epoch 049 | Batch 220 | Loss 0.00171: 100%|██████████| 229/229 [00:02<00:00, 106.84it/s]
Validating: Epoch 049 | Batch 050 | Loss 0.05936: 100%|██████████| 58/58 [00:00<00:00, 91.53it/s] 
Training: Epoch 050 | Batch 220 | Loss 0.00185: 100%|██████████| 229/229 [00:02<00:00, 105.37it/s]
Validating: Epoch 050 | Batch 050 | Loss 0.05822: 100%|██████████| 58/58 [00:00<00:00, 87.63it/s] 
Training: Epoch 051 | Batch 220 | Loss 0.00146: 100%|██████████| 229/229 [00:02<00:00, 107.60it/s]
Validating: Epoch 051 | Batch 050 | Loss 0.05986: 100%|██████████| 58/58 [00:00<00:00, 90.70it/s] 
Training: Epoch 052 | Batch 220 | Loss 0.00144: 100%|██████████| 229/229 [00:02<00:00, 105.67it/s]
Validating: Epoch 052 | Batch 050 | Loss 0.06116: 100%|██████████| 58/58 [00:00<00:00, 90.76it/s] 
Training: Epoch 053 | Batch 220 | Loss 0.00166: 100%|██████████| 229/229 [00:02<00:00, 106.10it/s]
Validating: Epoch 053 | Batch 050 | Loss 0.05842: 100%|██████████| 58/58 [00:00<00:00, 91.22it/s] 
Training: Epoch 054 | Batch 220 | Loss 0.00241: 100%|██████████| 229/229 [00:02<00:00, 102.91it/s]
Validating: Epoch 054 | Batch 050 | Loss 0.05751: 100%|██████████| 58/58 [00:00<00:00, 97.09it/s] 
Training: Epoch 055 | Batch 220 | Loss 0.00123: 100%|██████████| 229/229 [00:02<00:00, 105.30it/s]
Validating: Epoch 055 | Batch 050 | Loss 0.06182: 100%|██████████| 58/58 [00:00<00:00, 89.01it/s] 
Training: Epoch 056 | Batch 220 | Loss 0.03252: 100%|██████████| 229/229 [00:02<00:00, 110.65it/s]
Validating: Epoch 056 | Batch 050 | Loss 0.16650: 100%|██████████| 58/58 [00:00<00:00, 89.06it/s] 
Training: Epoch 057 | Batch 220 | Loss 0.01707: 100%|██████████| 229/229 [00:02<00:00, 105.40it/s]
Validating: Epoch 057 | Batch 050 | Loss 0.05987: 100%|██████████| 58/58 [00:00<00:00, 84.92it/s]
Training: Epoch 058 | Batch 220 | Loss 0.00228: 100%|██████████| 229/229 [00:02<00:00, 107.83it/s]
Validating: Epoch 058 | Batch 050 | Loss 0.06267: 100%|██████████| 58/58 [00:00<00:00, 95.49it/s] 
Training: Epoch 059 | Batch 220 | Loss 0.00170: 100%|██████████| 229/229 [00:02<00:00, 108.25it/s]
Validating: Epoch 059 | Batch 050 | Loss 0.06241: 100%|██████████| 58/58 [00:00<00:00, 84.93it/s]
Training: Epoch 060 | Batch 220 | Loss 0.00113: 100%|██████████| 229/229 [00:02<00:00, 109.46it/s]
Validating: Epoch 060 | Batch 050 | Loss 0.06262: 100%|██████████| 58/58 [00:00<00:00, 83.84it/s] 
Training: Epoch 061 | Batch 220 | Loss 0.00165: 100%|██████████| 229/229 [00:02<00:00, 107.48it/s]
Validating: Epoch 061 | Batch 050 | Loss 0.06174: 100%|██████████| 58/58 [00:00<00:00, 87.96it/s] 
Training: Epoch 062 | Batch 220 | Loss 0.00143: 100%|██████████| 229/229 [00:02<00:00, 109.12it/s]
Validating: Epoch 062 | Batch 050 | Loss 0.06411: 100%|██████████| 58/58 [00:00<00:00, 94.59it/s] 
Training: Epoch 063 | Batch 220 | Loss 0.00141: 100%|██████████| 229/229 [00:02<00:00, 106.45it/s]
Validating: Epoch 063 | Batch 050 | Loss 0.06081: 100%|██████████| 58/58 [00:00<00:00, 91.88it/s] 
Training: Epoch 064 | Batch 220 | Loss 0.00124: 100%|██████████| 229/229 [00:02<00:00, 105.55it/s]
Validating: Epoch 064 | Batch 050 | Loss 0.05898: 100%|██████████| 58/58 [00:00<00:00, 82.49it/s]
Training: Epoch 065 | Batch 220 | Loss 0.00106: 100%|██████████| 229/229 [00:02<00:00, 109.12it/s]
Validating: Epoch 065 | Batch 050 | Loss 0.06315: 100%|██████████| 58/58 [00:00<00:00, 91.09it/s] 
Training: Epoch 066 | Batch 220 | Loss 0.00087: 100%|██████████| 229/229 [00:02<00:00, 105.36it/s]
Validating: Epoch 066 | Batch 050 | Loss 0.06229: 100%|██████████| 58/58 [00:00<00:00, 88.54it/s] 
Training: Epoch 067 | Batch 220 | Loss 0.00071: 100%|██████████| 229/229 [00:02<00:00, 106.59it/s]
Validating: Epoch 067 | Batch 050 | Loss 0.06264: 100%|██████████| 58/58 [00:00<00:00, 88.56it/s] 
Training: Epoch 068 | Batch 220 | Loss 0.00061: 100%|██████████| 229/229 [00:02<00:00, 104.76it/s]
Validating: Epoch 068 | Batch 050 | Loss 0.06466: 100%|██████████| 58/58 [00:00<00:00, 85.92it/s]
Training: Epoch 069 | Batch 220 | Loss 0.00119: 100%|██████████| 229/229 [00:02<00:00, 107.11it/s]
Validating: Epoch 069 | Batch 050 | Loss 0.06093: 100%|██████████| 58/58 [00:00<00:00, 93.12it/s] 
Training: Epoch 070 | Batch 220 | Loss 0.00089: 100%|██████████| 229/229 [00:02<00:00, 109.40it/s]
Validating: Epoch 070 | Batch 050 | Loss 0.06358: 100%|██████████| 58/58 [00:00<00:00, 85.71it/s]
Training: Epoch 071 | Batch 220 | Loss 0.00057: 100%|██████████| 229/229 [00:02<00:00, 107.64it/s]
Validating: Epoch 071 | Batch 050 | Loss 0.06452: 100%|██████████| 58/58 [00:00<00:00, 91.85it/s] 
Training: Epoch 072 | Batch 220 | Loss 0.00092: 100%|██████████| 229/229 [00:02<00:00, 107.31it/s]
Validating: Epoch 072 | Batch 050 | Loss 0.06074: 100%|██████████| 58/58 [00:00<00:00, 84.90it/s] 
Training: Epoch 073 | Batch 220 | Loss 0.03255: 100%|██████████| 229/229 [00:02<00:00, 108.84it/s]
Validating: Epoch 073 | Batch 050 | Loss 0.08504: 100%|██████████| 58/58 [00:00<00:00, 87.27it/s] 
Training: Epoch 074 | Batch 220 | Loss 0.00281: 100%|██████████| 229/229 [00:02<00:00, 106.96it/s]
Validating: Epoch 074 | Batch 050 | Loss 0.06373: 100%|██████████| 58/58 [00:00<00:00, 89.18it/s] 
Training: Epoch 075 | Batch 220 | Loss 0.00223: 100%|██████████| 229/229 [00:02<00:00, 108.58it/s]
Validating: Epoch 075 | Batch 050 | Loss 0.06545: 100%|██████████| 58/58 [00:00<00:00, 87.87it/s] 
Training: Epoch 076 | Batch 220 | Loss 0.00103: 100%|██████████| 229/229 [00:02<00:00, 107.56it/s]
Validating: Epoch 076 | Batch 050 | Loss 0.06357: 100%|██████████| 58/58 [00:00<00:00, 86.94it/s] 
Training: Epoch 077 | Batch 220 | Loss 0.00105: 100%|██████████| 229/229 [00:02<00:00, 109.00it/s]
Validating: Epoch 077 | Batch 050 | Loss 0.06300: 100%|██████████| 58/58 [00:00<00:00, 87.49it/s] 
Training: Epoch 078 | Batch 220 | Loss 0.00093: 100%|██████████| 229/229 [00:02<00:00, 106.90it/s]
Validating: Epoch 078 | Batch 050 | Loss 0.06403: 100%|██████████| 58/58 [00:00<00:00, 82.95it/s]
Training: Epoch 079 | Batch 220 | Loss 0.00065: 100%|██████████| 229/229 [00:02<00:00, 107.72it/s]
Validating: Epoch 079 | Batch 050 | Loss 0.06398: 100%|██████████| 58/58 [00:00<00:00, 93.44it/s] 
Training: Epoch 080 | Batch 220 | Loss 0.00077: 100%|██████████| 229/229 [00:02<00:00, 107.04it/s]
Validating: Epoch 080 | Batch 050 | Loss 0.06301: 100%|██████████| 58/58 [00:00<00:00, 91.67it/s] 
Training: Epoch 081 | Batch 220 | Loss 0.00074: 100%|██████████| 229/229 [00:02<00:00, 106.59it/s]
Validating: Epoch 081 | Batch 050 | Loss 0.06515: 100%|██████████| 58/58 [00:00<00:00, 82.80it/s]
Training: Epoch 082 | Batch 220 | Loss 0.00177: 100%|██████████| 229/229 [00:02<00:00, 109.86it/s]
Validating: Epoch 082 | Batch 050 | Loss 0.06559: 100%|██████████| 58/58 [00:00<00:00, 87.76it/s] 
Training: Epoch 083 | Batch 220 | Loss 0.00119: 100%|██████████| 229/229 [00:02<00:00, 108.30it/s]
Validating: Epoch 083 | Batch 050 | Loss 0.06914: 100%|██████████| 58/58 [00:00<00:00, 91.36it/s] 
Training: Epoch 084 | Batch 220 | Loss 0.00067: 100%|██████████| 229/229 [00:02<00:00, 108.10it/s]
Validating: Epoch 084 | Batch 050 | Loss 0.06497: 100%|██████████| 58/58 [00:00<00:00, 88.19it/s] 
Training: Epoch 085 | Batch 220 | Loss 0.00093: 100%|██████████| 229/229 [00:02<00:00, 108.87it/s]
Validating: Epoch 085 | Batch 050 | Loss 0.06619: 100%|██████████| 58/58 [00:00<00:00, 93.81it/s] 
Training: Epoch 086 | Batch 220 | Loss 0.00057: 100%|██████████| 229/229 [00:02<00:00, 107.61it/s]
Validating: Epoch 086 | Batch 050 | Loss 0.06185: 100%|██████████| 58/58 [00:00<00:00, 90.70it/s] 
Training: Epoch 087 | Batch 220 | Loss 0.00115: 100%|██████████| 229/229 [00:02<00:00, 108.85it/s]
Validating: Epoch 087 | Batch 050 | Loss 0.06508: 100%|██████████| 58/58 [00:00<00:00, 86.22it/s] 
Training: Epoch 088 | Batch 220 | Loss 0.00080: 100%|██████████| 229/229 [00:02<00:00, 108.02it/s]
Validating: Epoch 088 | Batch 050 | Loss 0.06461: 100%|██████████| 58/58 [00:00<00:00, 94.06it/s] 
Training: Epoch 089 | Batch 220 | Loss 0.00098: 100%|██████████| 229/229 [00:02<00:00, 107.60it/s]
Validating: Epoch 089 | Batch 050 | Loss 0.06186: 100%|██████████| 58/58 [00:00<00:00, 86.00it/s]
Training: Epoch 090 | Batch 220 | Loss 0.00037: 100%|██████████| 229/229 [00:02<00:00, 107.66it/s]
Validating: Epoch 090 | Batch 050 | Loss 0.06342: 100%|██████████| 58/58 [00:00<00:00, 84.61it/s] 
Training: Epoch 091 | Batch 220 | Loss 0.04161: 100%|██████████| 229/229 [00:02<00:00, 105.12it/s]
Validating: Epoch 091 | Batch 050 | Loss 0.10540: 100%|██████████| 58/58 [00:00<00:00, 89.54it/s] 
Training: Epoch 092 | Batch 220 | Loss 0.00679: 100%|██████████| 229/229 [00:02<00:00, 106.63it/s]
Validating: Epoch 092 | Batch 050 | Loss 0.05895: 100%|██████████| 58/58 [00:00<00:00, 91.37it/s] 
Training: Epoch 093 | Batch 220 | Loss 0.00162: 100%|██████████| 229/229 [00:02<00:00, 109.07it/s]
Validating: Epoch 093 | Batch 050 | Loss 0.05628: 100%|██████████| 58/58 [00:00<00:00, 84.95it/s] 
Training: Epoch 094 | Batch 220 | Loss 0.00116: 100%|██████████| 229/229 [00:02<00:00, 107.00it/s]
Validating: Epoch 094 | Batch 050 | Loss 0.05685: 100%|██████████| 58/58 [00:00<00:00, 84.16it/s]
Training: Epoch 095 | Batch 220 | Loss 0.00089: 100%|██████████| 229/229 [00:02<00:00, 107.45it/s]
Validating: Epoch 095 | Batch 050 | Loss 0.05643: 100%|██████████| 58/58 [00:00<00:00, 92.41it/s] 
Training: Epoch 096 | Batch 220 | Loss 0.00086: 100%|██████████| 229/229 [00:02<00:00, 102.12it/s]
Validating: Epoch 096 | Batch 050 | Loss 0.05822: 100%|██████████| 58/58 [00:00<00:00, 91.48it/s] 
Training: Epoch 097 | Batch 220 | Loss 0.00066: 100%|██████████| 229/229 [00:02<00:00, 107.69it/s]
Validating: Epoch 097 | Batch 050 | Loss 0.05825: 100%|██████████| 58/58 [00:00<00:00, 83.81it/s] 
Training: Epoch 098 | Batch 220 | Loss 0.00068: 100%|██████████| 229/229 [00:02<00:00, 111.23it/s]
Validating: Epoch 098 | Batch 050 | Loss 0.06176: 100%|██████████| 58/58 [00:00<00:00, 90.20it/s] 
Training: Epoch 099 | Batch 220 | Loss 0.00062: 100%|██████████| 229/229 [00:02<00:00, 106.95it/s]
Validating: Epoch 099 | Batch 050 | Loss 0.05953: 100%|██████████| 58/58 [00:00<00:00, 90.91it/s] 
Training: Epoch 100 | Batch 220 | Loss 0.00053: 100%|██████████| 229/229 [00:02<00:00, 106.70it/s]
Validating: Epoch 100 | Batch 050 | Loss 0.06250: 100%|██████████| 58/58 [00:00<00:00, 89.45it/s] 
Looping finished
In [ ]:
plot.plot_loss(trainer_cnn.log)
plot.plot_metrics(trainer_cnn.log)
No description has been provided for this image
No description has been provided for this image

No surprises here. The CNN is much better. It converges quiet fast.

Some weird spikes occured after some epochs. Some small ones are still in the current plot. I strongly reduced the learning rate and which helped (exploding gradients?, numerical issues?). There is still some overfitting but its very slight on an absolute scale. The validation loss is much smoother than for the MLP which might indicate better generalization.

Evaluation¶

In [ ]:
_, model_cnn, _, _ = utils_checkpoints.load(path_dir_exp_cnn / "checkpoints" / "final.pth")
print(torchsummary.summary(model_cnn, config.MODEL["kwargs"]["shape_input"]))

evaluator_cnn = Evaluator("svhn_cnn", model_cnn)
evaluator_cnn.evaluate()

print(f"Loss on test data: {evaluator_cnn.log["total"]["loss"]}")
print(f"Metrics on test data")
for name, metrics in evaluator_cnn.log["total"]["metrics"].items():
    print(f"    {name:<10}: {metrics}")
==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
├─Sequential: 1-1                        [-1, 128, 4, 4]           --
|    └─BlockCNN2d: 2-1                   [-1, 32, 16, 16]          --
|    |    └─Conv2d: 3-1                  [-1, 32, 32, 32]          2,432
|    |    └─InstanceNorm2d: 3-2          [-1, 32, 32, 32]          --
|    |    └─ReLU: 3-3                    [-1, 32, 32, 32]          --
|    |    └─MaxPool2d: 3-4               [-1, 32, 16, 16]          --
|    └─BlockCNN2d: 2-2                   [-1, 64, 8, 8]            --
|    |    └─Conv2d: 3-5                  [-1, 64, 16, 16]          51,264
|    |    └─InstanceNorm2d: 3-6          [-1, 64, 16, 16]          --
|    |    └─ReLU: 3-7                    [-1, 64, 16, 16]          --
|    |    └─MaxPool2d: 3-8               [-1, 64, 8, 8]            --
|    └─BlockCNN2d: 2-3                   [-1, 128, 4, 4]           --
|    |    └─Conv2d: 3-9                  [-1, 128, 8, 8]           204,928
|    |    └─InstanceNorm2d: 3-10         [-1, 128, 8, 8]           --
|    |    └─ReLU: 3-11                   [-1, 128, 8, 8]           --
|    |    └─MaxPool2d: 3-12              [-1, 128, 4, 4]           --
├─MLP: 1-2                               [-1, 10]                  --
|    └─Sequential: 2-4                   [-1, 10]                  --
|    |    └─Flatten: 3-13                [-1, 2048]                --
|    |    └─Linear: 3-14                 [-1, 10]                  20,490
==========================================================================================
Total params: 279,114
Trainable params: 279,114
Non-trainable params: 0
Total mult-adds (M): 29.25
==========================================================================================
Input size (MB): 0.01
Forward/backward pass size (MB): 0.44
Params size (MB): 1.06
Estimated Total Size (MB): 1.51
==========================================================================================
==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
├─Sequential: 1-1                        [-1, 128, 4, 4]           --
|    └─BlockCNN2d: 2-1                   [-1, 32, 16, 16]          --
|    |    └─Conv2d: 3-1                  [-1, 32, 32, 32]          2,432
|    |    └─InstanceNorm2d: 3-2          [-1, 32, 32, 32]          --
|    |    └─ReLU: 3-3                    [-1, 32, 32, 32]          --
|    |    └─MaxPool2d: 3-4               [-1, 32, 16, 16]          --
|    └─BlockCNN2d: 2-2                   [-1, 64, 8, 8]            --
|    |    └─Conv2d: 3-5                  [-1, 64, 16, 16]          51,264
|    |    └─InstanceNorm2d: 3-6          [-1, 64, 16, 16]          --
|    |    └─ReLU: 3-7                    [-1, 64, 16, 16]          --
|    |    └─MaxPool2d: 3-8               [-1, 64, 8, 8]            --
|    └─BlockCNN2d: 2-3                   [-1, 128, 4, 4]           --
|    |    └─Conv2d: 3-9                  [-1, 128, 8, 8]           204,928
|    |    └─InstanceNorm2d: 3-10         [-1, 128, 8, 8]           --
|    |    └─ReLU: 3-11                   [-1, 128, 8, 8]           --
|    |    └─MaxPool2d: 3-12              [-1, 128, 4, 4]           --
├─MLP: 1-2                               [-1, 10]                  --
|    └─Sequential: 2-4                   [-1, 10]                  --
|    |    └─Flatten: 3-13                [-1, 2048]                --
|    |    └─Linear: 3-14                 [-1, 10]                  20,490
==========================================================================================
Total params: 279,114
Trainable params: 279,114
Non-trainable params: 0
Total mult-adds (M): 29.25
==========================================================================================
Input size (MB): 0.01
Forward/backward pass size (MB): 0.44
Params size (MB): 1.06
Estimated Total Size (MB): 1.51
==========================================================================================
Setting up dataloader...
Test dataset
Dataset SVHN
    Number of datapoints: 26032
    Root location: /home/user/karacora/lab-vision-systems-assignments/assignment_2/data/svhn
    Split: test
    Transform: Compose(
    ToTensor()
)
    Transform of target: Compose(
)
Setting up dataloader finished
Setting up criterion...
Setting up criterion finished
Setting up measurers...
Setting up measurers finished
Validating: Batch 100 | Loss 0.37699: 100%|██████████| 102/102 [00:00<00:00, 102.74it/s]
Loss on test data: 0.4231404592892694
Metrics on test data
    Accuracy  : 0.91329901659496

Discussion¶

The architectures of both models have been printed and discussed above. In summary, the two-hidden-layer MLP has $3428938$ trainable parameters and reaches an accuracy of approx. $0.81$. The fully convolutional CNN has three increasingly wide convolutional layers, only $279,114$ trainable parameters and reaches an accuracy of approx. $0.913$ on the test dataset. It was still much better on the validation set.

In summary, the CNN approach of focusing early on local features and building up a hierarchy of more abstract features is not only much more efficient than the general MLP approach but also performs much better.

Visualization of convolutional kernels (weights) and activations (feature maps)¶

In [ ]:
dataset_test, dataloader_test = utils_data.create_dataset_and_dataloader(split="test")
images = torch.unsqueeze(dataset_test[1][0].detach(), 0)

visualize.visualize_images(images)
No description has been provided for this image

Weights¶

In [ ]:
_, model, _, _ = utils_checkpoints.load(path_dir_exp_cnn / "checkpoints" / "final.pth")
model = utils_model.freeze(model)

names_nodes = []
kernels = {}
for name, module in model.named_modules():
    if isinstance(module, torch.nn.modules.conv.Conv2d):
        names_nodes += [name]
        kernels[name] = module.weight.detach().clone()

print("Convolutional layers")
print(names_nodes)

visualize.visualize_kernels(kernels)
Convolutional layers
['body.0.0', 'body.1.0', 'body.2.0']
No description has been provided for this image

Feature maps¶

In [ ]:
feature_extractor = feature_extraction.create_feature_extractor(model, return_nodes=names_nodes)
featuremaps = feature_extractor(images)

visualize.visualize_featuremaps(featuremaps)
No description has been provided for this image

Discussion¶

What do they represent?

The first figure shows an input image fed into the CNN. The second figure shows a single channel of the kernel weights that have been applied to the input image and feature maps. Lastly, the third figure shows the intermediate feature maps when feeding the input image into the model.

The hierarchical structure of this feedforward network can be observed here. The earliest layer (body.0.0) works as the most low-level feature extractor. One would expect to see lines and other simple structures. Arguably, the kernels in body.2.0 seem less structured but it is not as clear as with the feature maps. They correspond to the more abstract features using features from a larger perceptive field. Finally, this is leading to the semantically meaningful representation as class label prediction.

In accordance to this, the earlier feature maps somewhat resemble the input image with simple features highlighted, such as edges or corners. The second level of feature maps shows patterns which are clearly structured but do not resemble the input image as much anymore.The original image is increasingly obscured in the following layers, as both the receptive area of the neurons and the level of abstraction grow. From more abstract features the classifying head of the model is able to predict the class correctly.

Comparison of regularization methods¶

No regularization¶

See in Training the CNN

$\mathcal{L}_2$ regularization¶

In [ ]:
path_dir_exp_cnn_l2 = Path(config._PATH_DIR_EXPS) / "svhn_cnn_l2"

init_exp.init_exp(name_exp="svhn_cnn_l2", name_config="svhn_cnn_l2")
config.set_config_exp(path_dir_exp_cnn_l2)
Initializing experiment svhn_cnn_l2...
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn_l2
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn_l2/checkpoints
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn_l2/logs
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn_l2/plots
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn_l2/visualizations
Config loaded from /home/user/karacora/lab-vision-systems-assignments/assignment_2/assignment/configs/svhn_cnn_l2.yaml
Config saved to /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn_l2/config.yaml
Initializing experiment svhn_cnn_l2 finished
Config loaded from /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn_l2/config.yaml
In [ ]:
trainer_cnn_l2 = Trainer("svhn_cnn_l2")
trainer_cnn_l2.loop(config.TRAINING["num_epochs"])
Setting up dataloaders...
Train dataset
Dataset SVHN
    Number of datapoints: 58605
    Root location: /home/user/karacora/lab-vision-systems-assignments/assignment_2/data/svhn
    Split: train
    Transform: Compose(
    ToTensor()
)
    Transform of target: Compose(
)
Validate dataset
Dataset SVHN
    Number of datapoints: 14652
    Root location: /home/user/karacora/lab-vision-systems-assignments/assignment_2/data/svhn
    Split: validate
    Transform: Compose(
    ToTensor()
)
    Transform of target: Compose(
)
Setting up dataloaders finished
Setting up model...
Model
CNN2d(
  (body): Sequential(
    (0): BlockCNN2d(
      (0): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1), padding=same, padding_mode=reflect)
      (1): InstanceNorm2d(32, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      (2): ReLU()
      (3): MaxPool2d(kernel_size=[2, 2], stride=[2, 2], padding=0, dilation=1, ceil_mode=False)
    )
    (1): BlockCNN2d(
      (0): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1), padding=same, padding_mode=reflect)
      (1): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      (2): ReLU()
      (3): MaxPool2d(kernel_size=[2, 2], stride=[2, 2], padding=0, dilation=1, ceil_mode=False)
    )
    (2): BlockCNN2d(
      (0): Conv2d(64, 128, kernel_size=(5, 5), stride=(1, 1), padding=same, padding_mode=reflect)
      (1): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      (2): ReLU()
      (3): MaxPool2d(kernel_size=[2, 2], stride=[2, 2], padding=0, dilation=1, ceil_mode=False)
    )
  )
  (head): MLP(
    (head): Sequential(
      (0): Flatten(start_dim=1, end_dim=-1)
      (1): Linear(in_features=2048, out_features=10, bias=True)
    )
  )
)
Setting up model finished
Setting up optimizer...
Setting up optimizer finished
Setting up criterion...
Setting up criterion finished
Setting up measurers...
Setting up measurers finished
Looping...
Validating: Epoch 000 | Batch 050 | Loss 2.42990: 100%|██████████| 58/58 [00:00<00:00, 85.31it/s] 
Training: Epoch 001 | Batch 220 | Loss 0.52446: 100%|██████████| 229/229 [00:02<00:00, 103.49it/s]
Validating: Epoch 001 | Batch 050 | Loss 0.55977: 100%|██████████| 58/58 [00:00<00:00, 80.95it/s] 
Training: Epoch 002 | Batch 220 | Loss 0.31441: 100%|██████████| 229/229 [00:02<00:00, 108.10it/s]
Validating: Epoch 002 | Batch 050 | Loss 0.40763: 100%|██████████| 58/58 [00:00<00:00, 67.11it/s]
Training: Epoch 003 | Batch 220 | Loss 0.43473: 100%|██████████| 229/229 [00:02<00:00, 106.46it/s]
Validating: Epoch 003 | Batch 050 | Loss 0.34084: 100%|██████████| 58/58 [00:00<00:00, 83.80it/s] 
Training: Epoch 004 | Batch 220 | Loss 0.37728: 100%|██████████| 229/229 [00:02<00:00, 109.36it/s]
Validating: Epoch 004 | Batch 050 | Loss 0.29583: 100%|██████████| 58/58 [00:00<00:00, 88.66it/s] 
Training: Epoch 005 | Batch 220 | Loss 0.35442: 100%|██████████| 229/229 [00:02<00:00, 108.13it/s]
Validating: Epoch 005 | Batch 050 | Loss 0.25626: 100%|██████████| 58/58 [00:00<00:00, 87.52it/s] 
Training: Epoch 006 | Batch 220 | Loss 0.29837: 100%|██████████| 229/229 [00:02<00:00, 109.32it/s]
Validating: Epoch 006 | Batch 050 | Loss 0.22175: 100%|██████████| 58/58 [00:00<00:00, 82.40it/s]
Training: Epoch 007 | Batch 220 | Loss 0.28393: 100%|██████████| 229/229 [00:02<00:00, 109.75it/s]
Validating: Epoch 007 | Batch 050 | Loss 0.21184: 100%|██████████| 58/58 [00:00<00:00, 82.97it/s] 
Training: Epoch 008 | Batch 220 | Loss 0.23449: 100%|██████████| 229/229 [00:02<00:00, 107.30it/s]
Validating: Epoch 008 | Batch 050 | Loss 0.18715: 100%|██████████| 58/58 [00:00<00:00, 87.36it/s] 
Training: Epoch 009 | Batch 220 | Loss 0.15728: 100%|██████████| 229/229 [00:02<00:00, 104.98it/s]
Validating: Epoch 009 | Batch 050 | Loss 0.17908: 100%|██████████| 58/58 [00:00<00:00, 82.76it/s] 
Training: Epoch 010 | Batch 220 | Loss 0.18513: 100%|██████████| 229/229 [00:02<00:00, 107.46it/s]
Validating: Epoch 010 | Batch 050 | Loss 0.16173: 100%|██████████| 58/58 [00:00<00:00, 78.58it/s]
Training: Epoch 011 | Batch 220 | Loss 0.16813: 100%|██████████| 229/229 [00:02<00:00, 107.62it/s]
Validating: Epoch 011 | Batch 050 | Loss 0.15288: 100%|██████████| 58/58 [00:00<00:00, 82.06it/s] 
Training: Epoch 012 | Batch 220 | Loss 0.14595: 100%|██████████| 229/229 [00:02<00:00, 107.87it/s]
Validating: Epoch 012 | Batch 050 | Loss 0.14381: 100%|██████████| 58/58 [00:00<00:00, 82.81it/s] 
Training: Epoch 013 | Batch 220 | Loss 0.08237: 100%|██████████| 229/229 [00:02<00:00, 107.01it/s]
Validating: Epoch 013 | Batch 050 | Loss 0.13259: 100%|██████████| 58/58 [00:00<00:00, 81.27it/s]
Training: Epoch 014 | Batch 220 | Loss 0.13198: 100%|██████████| 229/229 [00:02<00:00, 109.62it/s]
Validating: Epoch 014 | Batch 050 | Loss 0.12212: 100%|██████████| 58/58 [00:00<00:00, 84.22it/s]
Training: Epoch 015 | Batch 220 | Loss 0.14454: 100%|██████████| 229/229 [00:02<00:00, 108.25it/s]
Validating: Epoch 015 | Batch 050 | Loss 0.11230: 100%|██████████| 58/58 [00:00<00:00, 81.77it/s]
Training: Epoch 016 | Batch 220 | Loss 0.14578: 100%|██████████| 229/229 [00:02<00:00, 107.79it/s]
Validating: Epoch 016 | Batch 050 | Loss 0.11464: 100%|██████████| 58/58 [00:00<00:00, 87.01it/s] 
Training: Epoch 017 | Batch 220 | Loss 0.11504: 100%|██████████| 229/229 [00:02<00:00, 107.36it/s]
Validating: Epoch 017 | Batch 050 | Loss 0.12455: 100%|██████████| 58/58 [00:00<00:00, 86.62it/s] 
Training: Epoch 018 | Batch 220 | Loss 0.14839: 100%|██████████| 229/229 [00:02<00:00, 107.84it/s]
Validating: Epoch 018 | Batch 050 | Loss 0.11566: 100%|██████████| 58/58 [00:00<00:00, 87.89it/s] 
Training: Epoch 019 | Batch 220 | Loss 0.16230: 100%|██████████| 229/229 [00:02<00:00, 108.32it/s]
Validating: Epoch 019 | Batch 050 | Loss 0.11882: 100%|██████████| 58/58 [00:00<00:00, 84.47it/s] 
Training: Epoch 020 | Batch 220 | Loss 0.09414: 100%|██████████| 229/229 [00:02<00:00, 108.97it/s]
Validating: Epoch 020 | Batch 050 | Loss 0.10324: 100%|██████████| 58/58 [00:00<00:00, 86.11it/s] 
Training: Epoch 021 | Batch 220 | Loss 0.10726: 100%|██████████| 229/229 [00:02<00:00, 107.53it/s]
Validating: Epoch 021 | Batch 050 | Loss 0.10669: 100%|██████████| 58/58 [00:00<00:00, 79.92it/s]
Training: Epoch 022 | Batch 220 | Loss 0.11384: 100%|██████████| 229/229 [00:02<00:00, 108.29it/s]
Validating: Epoch 022 | Batch 050 | Loss 0.10395: 100%|██████████| 58/58 [00:00<00:00, 85.75it/s] 
Training: Epoch 023 | Batch 220 | Loss 0.08950: 100%|██████████| 229/229 [00:02<00:00, 105.51it/s]
Validating: Epoch 023 | Batch 050 | Loss 0.11108: 100%|██████████| 58/58 [00:00<00:00, 87.71it/s] 
Training: Epoch 024 | Batch 220 | Loss 0.09105: 100%|██████████| 229/229 [00:02<00:00, 108.34it/s]
Validating: Epoch 024 | Batch 050 | Loss 0.09575: 100%|██████████| 58/58 [00:00<00:00, 89.74it/s] 
Training: Epoch 025 | Batch 220 | Loss 0.10922: 100%|██████████| 229/229 [00:02<00:00, 106.13it/s]
Validating: Epoch 025 | Batch 050 | Loss 0.10935: 100%|██████████| 58/58 [00:00<00:00, 80.51it/s]
Training: Epoch 026 | Batch 220 | Loss 0.06860: 100%|██████████| 229/229 [00:02<00:00, 108.12it/s]
Validating: Epoch 026 | Batch 050 | Loss 0.11011: 100%|██████████| 58/58 [00:00<00:00, 82.15it/s]
Training: Epoch 027 | Batch 220 | Loss 0.10157: 100%|██████████| 229/229 [00:02<00:00, 108.66it/s]
Validating: Epoch 027 | Batch 050 | Loss 0.10538: 100%|██████████| 58/58 [00:00<00:00, 81.36it/s]
Training: Epoch 028 | Batch 220 | Loss 0.07934: 100%|██████████| 229/229 [00:02<00:00, 107.94it/s]
Validating: Epoch 028 | Batch 050 | Loss 0.09868: 100%|██████████| 58/58 [00:00<00:00, 81.21it/s] 
Training: Epoch 029 | Batch 220 | Loss 0.06891: 100%|██████████| 229/229 [00:02<00:00, 107.17it/s]
Validating: Epoch 029 | Batch 050 | Loss 0.11004: 100%|██████████| 58/58 [00:00<00:00, 81.69it/s] 
Training: Epoch 030 | Batch 220 | Loss 0.07793: 100%|██████████| 229/229 [00:02<00:00, 107.40it/s]
Validating: Epoch 030 | Batch 050 | Loss 0.10318: 100%|██████████| 58/58 [00:00<00:00, 86.15it/s] 
Training: Epoch 031 | Batch 220 | Loss 0.11467: 100%|██████████| 229/229 [00:02<00:00, 103.31it/s]
Validating: Epoch 031 | Batch 050 | Loss 0.12269: 100%|██████████| 58/58 [00:00<00:00, 83.39it/s]
Training: Epoch 032 | Batch 220 | Loss 0.09932: 100%|██████████| 229/229 [00:02<00:00, 108.57it/s]
Validating: Epoch 032 | Batch 050 | Loss 0.10230: 100%|██████████| 58/58 [00:00<00:00, 85.70it/s] 
Training: Epoch 033 | Batch 220 | Loss 0.09103: 100%|██████████| 229/229 [00:02<00:00, 106.58it/s]
Validating: Epoch 033 | Batch 050 | Loss 0.10118: 100%|██████████| 58/58 [00:00<00:00, 82.74it/s]
Training: Epoch 034 | Batch 220 | Loss 0.06987: 100%|██████████| 229/229 [00:02<00:00, 106.58it/s]
Validating: Epoch 034 | Batch 050 | Loss 0.12222: 100%|██████████| 58/58 [00:00<00:00, 87.75it/s] 
Training: Epoch 035 | Batch 220 | Loss 0.08590: 100%|██████████| 229/229 [00:02<00:00, 107.23it/s]
Validating: Epoch 035 | Batch 050 | Loss 0.10041: 100%|██████████| 58/58 [00:00<00:00, 88.74it/s] 
Training: Epoch 036 | Batch 220 | Loss 0.07218: 100%|██████████| 229/229 [00:02<00:00, 106.96it/s]
Validating: Epoch 036 | Batch 050 | Loss 0.10060: 100%|██████████| 58/58 [00:00<00:00, 85.06it/s] 
Training: Epoch 037 | Batch 220 | Loss 0.08990: 100%|██████████| 229/229 [00:02<00:00, 105.33it/s]
Validating: Epoch 037 | Batch 050 | Loss 0.10241: 100%|██████████| 58/58 [00:00<00:00, 82.99it/s] 
Training: Epoch 038 | Batch 220 | Loss 0.07455: 100%|██████████| 229/229 [00:02<00:00, 106.60it/s]
Validating: Epoch 038 | Batch 050 | Loss 0.10341: 100%|██████████| 58/58 [00:00<00:00, 88.13it/s] 
Training: Epoch 039 | Batch 220 | Loss 0.11110: 100%|██████████| 229/229 [00:02<00:00, 106.46it/s]
Validating: Epoch 039 | Batch 050 | Loss 0.10379: 100%|██████████| 58/58 [00:00<00:00, 80.79it/s]
Training: Epoch 040 | Batch 220 | Loss 0.13026: 100%|██████████| 229/229 [00:02<00:00, 106.77it/s]
Validating: Epoch 040 | Batch 050 | Loss 0.09900: 100%|██████████| 58/58 [00:00<00:00, 85.55it/s] 
Training: Epoch 041 | Batch 220 | Loss 0.07833: 100%|██████████| 229/229 [00:02<00:00, 105.32it/s]
Validating: Epoch 041 | Batch 050 | Loss 0.09524: 100%|██████████| 58/58 [00:00<00:00, 87.36it/s] 
Training: Epoch 042 | Batch 220 | Loss 0.06912: 100%|██████████| 229/229 [00:02<00:00, 107.98it/s]
Validating: Epoch 042 | Batch 050 | Loss 0.10093: 100%|██████████| 58/58 [00:00<00:00, 83.62it/s] 
Training: Epoch 043 | Batch 220 | Loss 0.10167: 100%|██████████| 229/229 [00:02<00:00, 106.55it/s]
Validating: Epoch 043 | Batch 050 | Loss 0.10302: 100%|██████████| 58/58 [00:00<00:00, 88.37it/s] 
Training: Epoch 044 | Batch 220 | Loss 0.08602: 100%|██████████| 229/229 [00:02<00:00, 109.05it/s]
Validating: Epoch 044 | Batch 050 | Loss 0.10418: 100%|██████████| 58/58 [00:00<00:00, 85.25it/s] 
Training: Epoch 045 | Batch 220 | Loss 0.05522: 100%|██████████| 229/229 [00:02<00:00, 106.48it/s]
Validating: Epoch 045 | Batch 050 | Loss 0.10231: 100%|██████████| 58/58 [00:00<00:00, 90.45it/s] 
Training: Epoch 046 | Batch 220 | Loss 0.06209: 100%|██████████| 229/229 [00:02<00:00, 107.26it/s]
Validating: Epoch 046 | Batch 050 | Loss 0.10906: 100%|██████████| 58/58 [00:00<00:00, 85.29it/s] 
Training: Epoch 047 | Batch 220 | Loss 0.06196: 100%|██████████| 229/229 [00:02<00:00, 107.56it/s]
Validating: Epoch 047 | Batch 050 | Loss 0.10790: 100%|██████████| 58/58 [00:00<00:00, 81.66it/s] 
Training: Epoch 048 | Batch 220 | Loss 0.07970: 100%|██████████| 229/229 [00:02<00:00, 106.45it/s]
Validating: Epoch 048 | Batch 050 | Loss 0.10406: 100%|██████████| 58/58 [00:00<00:00, 86.54it/s] 
Training: Epoch 049 | Batch 220 | Loss 0.07192: 100%|██████████| 229/229 [00:02<00:00, 105.68it/s]
Validating: Epoch 049 | Batch 050 | Loss 0.09554: 100%|██████████| 58/58 [00:00<00:00, 83.37it/s] 
Training: Epoch 050 | Batch 220 | Loss 0.07880: 100%|██████████| 229/229 [00:02<00:00, 106.80it/s]
Validating: Epoch 050 | Batch 050 | Loss 0.08433: 100%|██████████| 58/58 [00:00<00:00, 87.85it/s] 
Training: Epoch 051 | Batch 220 | Loss 0.04002: 100%|██████████| 229/229 [00:02<00:00, 109.18it/s]
Validating: Epoch 051 | Batch 050 | Loss 0.09268: 100%|██████████| 58/58 [00:00<00:00, 88.15it/s] 
Training: Epoch 052 | Batch 220 | Loss 0.09367: 100%|██████████| 229/229 [00:02<00:00, 105.21it/s]
Validating: Epoch 052 | Batch 050 | Loss 0.10430: 100%|██████████| 58/58 [00:00<00:00, 88.45it/s] 
Training: Epoch 053 | Batch 220 | Loss 0.09092: 100%|██████████| 229/229 [00:02<00:00, 107.73it/s]
Validating: Epoch 053 | Batch 050 | Loss 0.09965: 100%|██████████| 58/58 [00:00<00:00, 85.97it/s] 
Training: Epoch 054 | Batch 220 | Loss 0.06631: 100%|██████████| 229/229 [00:02<00:00, 107.27it/s]
Validating: Epoch 054 | Batch 050 | Loss 0.11100: 100%|██████████| 58/58 [00:00<00:00, 88.02it/s] 
Training: Epoch 055 | Batch 220 | Loss 0.07466: 100%|██████████| 229/229 [00:02<00:00, 108.19it/s]
Validating: Epoch 055 | Batch 050 | Loss 0.09498: 100%|██████████| 58/58 [00:00<00:00, 83.17it/s] 
Training: Epoch 056 | Batch 220 | Loss 0.06619: 100%|██████████| 229/229 [00:02<00:00, 106.92it/s]
Validating: Epoch 056 | Batch 050 | Loss 0.09900: 100%|██████████| 58/58 [00:00<00:00, 83.21it/s] 
Training: Epoch 057 | Batch 220 | Loss 0.06776: 100%|██████████| 229/229 [00:02<00:00, 107.76it/s]
Validating: Epoch 057 | Batch 050 | Loss 0.09198: 100%|██████████| 58/58 [00:00<00:00, 88.39it/s] 
Training: Epoch 058 | Batch 220 | Loss 0.07378: 100%|██████████| 229/229 [00:02<00:00, 108.53it/s]
Validating: Epoch 058 | Batch 050 | Loss 0.11190: 100%|██████████| 58/58 [00:00<00:00, 90.31it/s] 
Training: Epoch 059 | Batch 220 | Loss 0.05860: 100%|██████████| 229/229 [00:02<00:00, 105.92it/s]
Validating: Epoch 059 | Batch 050 | Loss 0.09230: 100%|██████████| 58/58 [00:00<00:00, 80.90it/s]
Training: Epoch 060 | Batch 220 | Loss 0.08428: 100%|██████████| 229/229 [00:02<00:00, 107.49it/s]
Validating: Epoch 060 | Batch 050 | Loss 0.09873: 100%|██████████| 58/58 [00:00<00:00, 90.72it/s] 
Training: Epoch 061 | Batch 220 | Loss 0.11495: 100%|██████████| 229/229 [00:02<00:00, 107.76it/s]
Validating: Epoch 061 | Batch 050 | Loss 0.11998: 100%|██████████| 58/58 [00:00<00:00, 81.72it/s]
Training: Epoch 062 | Batch 220 | Loss 0.05956: 100%|██████████| 229/229 [00:02<00:00, 108.30it/s]
Validating: Epoch 062 | Batch 050 | Loss 0.09671: 100%|██████████| 58/58 [00:00<00:00, 85.32it/s] 
Training: Epoch 063 | Batch 220 | Loss 0.04647: 100%|██████████| 229/229 [00:02<00:00, 105.55it/s]
Validating: Epoch 063 | Batch 050 | Loss 0.10586: 100%|██████████| 58/58 [00:00<00:00, 83.60it/s] 
Training: Epoch 064 | Batch 220 | Loss 0.05085: 100%|██████████| 229/229 [00:02<00:00, 108.76it/s]
Validating: Epoch 064 | Batch 050 | Loss 0.10031: 100%|██████████| 58/58 [00:00<00:00, 85.46it/s] 
Training: Epoch 065 | Batch 220 | Loss 0.09925: 100%|██████████| 229/229 [00:02<00:00, 106.23it/s]
Validating: Epoch 065 | Batch 050 | Loss 0.09432: 100%|██████████| 58/58 [00:00<00:00, 84.87it/s] 
Training: Epoch 066 | Batch 220 | Loss 0.07004: 100%|██████████| 229/229 [00:02<00:00, 109.24it/s]
Validating: Epoch 066 | Batch 050 | Loss 0.09177: 100%|██████████| 58/58 [00:00<00:00, 81.27it/s]
Training: Epoch 067 | Batch 220 | Loss 0.04566: 100%|██████████| 229/229 [00:02<00:00, 106.68it/s]
Validating: Epoch 067 | Batch 050 | Loss 0.09089: 100%|██████████| 58/58 [00:00<00:00, 79.06it/s]
Training: Epoch 068 | Batch 220 | Loss 0.09814: 100%|██████████| 229/229 [00:02<00:00, 106.92it/s]
Validating: Epoch 068 | Batch 050 | Loss 0.10137: 100%|██████████| 58/58 [00:00<00:00, 84.09it/s] 
Training: Epoch 069 | Batch 220 | Loss 0.06258: 100%|██████████| 229/229 [00:02<00:00, 107.97it/s]
Validating: Epoch 069 | Batch 050 | Loss 0.10273: 100%|██████████| 58/58 [00:00<00:00, 85.83it/s] 
Training: Epoch 070 | Batch 220 | Loss 0.06328: 100%|██████████| 229/229 [00:02<00:00, 107.77it/s]
Validating: Epoch 070 | Batch 050 | Loss 0.10088: 100%|██████████| 58/58 [00:00<00:00, 81.17it/s]
Training: Epoch 071 | Batch 220 | Loss 0.05516: 100%|██████████| 229/229 [00:02<00:00, 107.32it/s]
Validating: Epoch 071 | Batch 050 | Loss 0.09738: 100%|██████████| 58/58 [00:00<00:00, 89.26it/s] 
Training: Epoch 072 | Batch 220 | Loss 0.06198: 100%|██████████| 229/229 [00:02<00:00, 106.03it/s]
Validating: Epoch 072 | Batch 050 | Loss 0.10578: 100%|██████████| 58/58 [00:00<00:00, 84.85it/s] 
Training: Epoch 073 | Batch 220 | Loss 0.09722: 100%|██████████| 229/229 [00:02<00:00, 108.92it/s]
Validating: Epoch 073 | Batch 050 | Loss 0.10030: 100%|██████████| 58/58 [00:00<00:00, 87.00it/s] 
Training: Epoch 074 | Batch 220 | Loss 0.05368: 100%|██████████| 229/229 [00:02<00:00, 108.82it/s]
Validating: Epoch 074 | Batch 050 | Loss 0.08828: 100%|██████████| 58/58 [00:00<00:00, 88.33it/s] 
Training: Epoch 075 | Batch 220 | Loss 0.06746: 100%|██████████| 229/229 [00:02<00:00, 108.72it/s]
Validating: Epoch 075 | Batch 050 | Loss 0.08965: 100%|██████████| 58/58 [00:00<00:00, 89.49it/s] 
Training: Epoch 076 | Batch 220 | Loss 0.06763: 100%|██████████| 229/229 [00:02<00:00, 106.51it/s]
Validating: Epoch 076 | Batch 050 | Loss 0.11091: 100%|██████████| 58/58 [00:00<00:00, 80.57it/s]
Training: Epoch 077 | Batch 220 | Loss 0.06606: 100%|██████████| 229/229 [00:02<00:00, 107.96it/s]
Validating: Epoch 077 | Batch 050 | Loss 0.10814: 100%|██████████| 58/58 [00:00<00:00, 85.61it/s] 
Training: Epoch 078 | Batch 220 | Loss 0.07980: 100%|██████████| 229/229 [00:02<00:00, 106.95it/s]
Validating: Epoch 078 | Batch 050 | Loss 0.08395: 100%|██████████| 58/58 [00:00<00:00, 84.33it/s] 
Training: Epoch 079 | Batch 220 | Loss 0.05193: 100%|██████████| 229/229 [00:02<00:00, 107.78it/s]
Validating: Epoch 079 | Batch 050 | Loss 0.09034: 100%|██████████| 58/58 [00:00<00:00, 83.19it/s]
Training: Epoch 080 | Batch 220 | Loss 0.05757: 100%|██████████| 229/229 [00:02<00:00, 107.90it/s]
Validating: Epoch 080 | Batch 050 | Loss 0.10233: 100%|██████████| 58/58 [00:00<00:00, 88.91it/s] 
Training: Epoch 081 | Batch 220 | Loss 0.06884: 100%|██████████| 229/229 [00:02<00:00, 107.63it/s]
Validating: Epoch 081 | Batch 050 | Loss 0.08545: 100%|██████████| 58/58 [00:00<00:00, 84.33it/s] 
Training: Epoch 082 | Batch 220 | Loss 0.08204: 100%|██████████| 229/229 [00:02<00:00, 106.68it/s]
Validating: Epoch 082 | Batch 050 | Loss 0.10481: 100%|██████████| 58/58 [00:00<00:00, 79.65it/s]
Training: Epoch 083 | Batch 220 | Loss 0.04876: 100%|██████████| 229/229 [00:02<00:00, 105.54it/s]
Validating: Epoch 083 | Batch 050 | Loss 0.08805: 100%|██████████| 58/58 [00:00<00:00, 75.57it/s]
Training: Epoch 084 | Batch 220 | Loss 0.07242: 100%|██████████| 229/229 [00:02<00:00, 108.34it/s]
Validating: Epoch 084 | Batch 050 | Loss 0.10291: 100%|██████████| 58/58 [00:00<00:00, 82.43it/s]
Training: Epoch 085 | Batch 220 | Loss 0.07288: 100%|██████████| 229/229 [00:02<00:00, 108.81it/s]
Validating: Epoch 085 | Batch 050 | Loss 0.09533: 100%|██████████| 58/58 [00:00<00:00, 82.94it/s] 
Training: Epoch 086 | Batch 220 | Loss 0.05905: 100%|██████████| 229/229 [00:02<00:00, 107.27it/s]
Validating: Epoch 086 | Batch 050 | Loss 0.08672: 100%|██████████| 58/58 [00:00<00:00, 78.43it/s]
Training: Epoch 087 | Batch 220 | Loss 0.06330: 100%|██████████| 229/229 [00:02<00:00, 107.13it/s]
Validating: Epoch 087 | Batch 050 | Loss 0.09870: 100%|██████████| 58/58 [00:00<00:00, 88.39it/s] 
Training: Epoch 088 | Batch 220 | Loss 0.06450: 100%|██████████| 229/229 [00:02<00:00, 106.81it/s]
Validating: Epoch 088 | Batch 050 | Loss 0.10767: 100%|██████████| 58/58 [00:00<00:00, 90.46it/s] 
Training: Epoch 089 | Batch 220 | Loss 0.06036: 100%|██████████| 229/229 [00:02<00:00, 108.11it/s]
Validating: Epoch 089 | Batch 050 | Loss 0.08527: 100%|██████████| 58/58 [00:00<00:00, 85.57it/s] 
Training: Epoch 090 | Batch 220 | Loss 0.06252: 100%|██████████| 229/229 [00:02<00:00, 107.90it/s]
Validating: Epoch 090 | Batch 050 | Loss 0.08481: 100%|██████████| 58/58 [00:00<00:00, 86.36it/s] 
Training: Epoch 091 | Batch 220 | Loss 0.06585: 100%|██████████| 229/229 [00:02<00:00, 107.25it/s]
Validating: Epoch 091 | Batch 050 | Loss 0.10139: 100%|██████████| 58/58 [00:00<00:00, 87.56it/s] 
Training: Epoch 092 | Batch 220 | Loss 0.10077: 100%|██████████| 229/229 [00:02<00:00, 106.88it/s]
Validating: Epoch 092 | Batch 050 | Loss 0.11978: 100%|██████████| 58/58 [00:00<00:00, 83.13it/s] 
Training: Epoch 093 | Batch 220 | Loss 0.08114: 100%|██████████| 229/229 [00:02<00:00, 105.91it/s]
Validating: Epoch 093 | Batch 050 | Loss 0.11082: 100%|██████████| 58/58 [00:00<00:00, 86.61it/s] 
Training: Epoch 094 | Batch 220 | Loss 0.07212: 100%|██████████| 229/229 [00:02<00:00, 107.07it/s]
Validating: Epoch 094 | Batch 050 | Loss 0.08629: 100%|██████████| 58/58 [00:00<00:00, 88.34it/s] 
Training: Epoch 095 | Batch 220 | Loss 0.06438: 100%|██████████| 229/229 [00:02<00:00, 107.16it/s]
Validating: Epoch 095 | Batch 050 | Loss 0.09790: 100%|██████████| 58/58 [00:00<00:00, 87.84it/s] 
Training: Epoch 096 | Batch 220 | Loss 0.06214: 100%|██████████| 229/229 [00:02<00:00, 107.50it/s]
Validating: Epoch 096 | Batch 050 | Loss 0.10989: 100%|██████████| 58/58 [00:00<00:00, 86.55it/s] 
Training: Epoch 097 | Batch 220 | Loss 0.07636: 100%|██████████| 229/229 [00:02<00:00, 105.60it/s]
Validating: Epoch 097 | Batch 050 | Loss 0.11383: 100%|██████████| 58/58 [00:00<00:00, 89.21it/s] 
Training: Epoch 098 | Batch 220 | Loss 0.05246: 100%|██████████| 229/229 [00:02<00:00, 108.25it/s]
Validating: Epoch 098 | Batch 050 | Loss 0.08559: 100%|██████████| 58/58 [00:00<00:00, 85.00it/s] 
Training: Epoch 099 | Batch 220 | Loss 0.05456: 100%|██████████| 229/229 [00:02<00:00, 104.73it/s]
Validating: Epoch 099 | Batch 050 | Loss 0.09434: 100%|██████████| 58/58 [00:00<00:00, 87.72it/s] 
Training: Epoch 100 | Batch 220 | Loss 0.06229: 100%|██████████| 229/229 [00:02<00:00, 107.00it/s]
Validating: Epoch 100 | Batch 050 | Loss 0.08763: 100%|██████████| 58/58 [00:00<00:00, 90.47it/s] 
Looping finished

In [ ]:
plot.plot_loss(trainer_cnn_l2.log)
plot.plot_metrics(trainer_cnn_l2.log)
No description has been provided for this image
No description has been provided for this image
In [ ]:
_, model_cnn_l2, _, _ = utils_checkpoints.load(path_dir_exp_cnn_l2 / "checkpoints" / "final.pth")

evaluator_cnn_l2 = Evaluator("svhn_cnn_l2", model_cnn_l2)
evaluator_cnn_l2.evaluate()

print(f"Loss on test data: {evaluator_cnn_l2.log["total"]["loss"]}")
print(f"Metrics on test data")
for name, metrics in evaluator_cnn_l2.log["total"]["metrics"].items():
    print(f"    {name:<10}: {metrics}")
Setting up dataloader...
Test dataset
Dataset SVHN
    Number of datapoints: 26032
    Root location: /home/user/karacora/lab-vision-systems-assignments/assignment_2/data/svhn
    Split: test
    Transform: Compose(
    ToTensor()
)
    Transform of target: Compose(
)
Setting up dataloader finished
Setting up criterion...
Setting up criterion finished
Setting up measurers...
Setting up measurers finished
Validating: Batch 100 | Loss 20.93285: 100%|██████████| 102/102 [00:01<00:00, 92.96it/s]
Loss on test data: 20.99265132451541
Metrics on test data
    Accuracy  : 0.9183312845728334

$\mathcal{L}_1$ regularization¶

In [ ]:
path_dir_exp_cnn_l1 = Path(config._PATH_DIR_EXPS) / "svhn_cnn_l1"

init_exp.init_exp(name_exp="svhn_cnn_l1", name_config="svhn_cnn_l1")
config.set_config_exp(path_dir_exp_cnn_l1)
Initializing experiment svhn_cnn_l1...
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn_l1
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn_l1/checkpoints
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn_l1/logs
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn_l1/plots
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn_l1/visualizations
Config loaded from /home/user/karacora/lab-vision-systems-assignments/assignment_2/assignment/configs/svhn_cnn_l1.yaml
Config saved to /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn_l1/config.yaml
Initializing experiment svhn_cnn_l1 finished
Config loaded from /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn_l1/config.yaml
In [ ]:
trainer_cnn_l1 = Trainer("svhn_cnn_l1")
trainer_cnn_l1.loop(config.TRAINING["num_epochs"])
Setting up dataloaders...
Train dataset
Dataset SVHN
    Number of datapoints: 58605
    Root location: /home/user/karacora/lab-vision-systems-assignments/assignment_2/data/svhn
    Split: train
    Transform: Compose(
    ToTensor()
)
    Transform of target: Compose(
)
Validate dataset
Dataset SVHN
    Number of datapoints: 14652
    Root location: /home/user/karacora/lab-vision-systems-assignments/assignment_2/data/svhn
    Split: validate
    Transform: Compose(
    ToTensor()
)
    Transform of target: Compose(
)
Setting up dataloaders finished
Setting up model...
Model
CNN2d(
  (body): Sequential(
    (0): BlockCNN2d(
      (0): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1), padding=same, padding_mode=reflect)
      (1): InstanceNorm2d(32, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      (2): ReLU()
      (3): MaxPool2d(kernel_size=[2, 2], stride=[2, 2], padding=0, dilation=1, ceil_mode=False)
    )
    (1): BlockCNN2d(
      (0): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1), padding=same, padding_mode=reflect)
      (1): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      (2): ReLU()
      (3): MaxPool2d(kernel_size=[2, 2], stride=[2, 2], padding=0, dilation=1, ceil_mode=False)
    )
    (2): BlockCNN2d(
      (0): Conv2d(64, 128, kernel_size=(5, 5), stride=(1, 1), padding=same, padding_mode=reflect)
      (1): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      (2): ReLU()
      (3): MaxPool2d(kernel_size=[2, 2], stride=[2, 2], padding=0, dilation=1, ceil_mode=False)
    )
  )
  (head): MLP(
    (head): Sequential(
      (0): Flatten(start_dim=1, end_dim=-1)
      (1): Linear(in_features=2048, out_features=10, bias=True)
    )
  )
)
Setting up model finished
Setting up optimizer...
Setting up optimizer finished
Setting up criterion...
Setting up criterion finished
Setting up measurers...
Setting up measurers finished
Looping...
Validating: Epoch 000 | Batch 050 | Loss 78.84438: 100%|██████████| 58/58 [00:00<00:00, 64.61it/s]
Training: Epoch 001 | Batch 220 | Loss 9.25657: 100%|██████████| 229/229 [00:02<00:00, 102.53it/s] 
Validating: Epoch 001 | Batch 050 | Loss 8.23288: 100%|██████████| 58/58 [00:00<00:00, 77.72it/s]
Training: Epoch 002 | Batch 220 | Loss 3.22939: 100%|██████████| 229/229 [00:02<00:00, 101.08it/s]
Validating: Epoch 002 | Batch 050 | Loss 3.20147: 100%|██████████| 58/58 [00:00<00:00, 78.80it/s]
Training: Epoch 003 | Batch 220 | Loss 2.56147: 100%|██████████| 229/229 [00:02<00:00, 103.50it/s]
Validating: Epoch 003 | Batch 050 | Loss 2.57799: 100%|██████████| 58/58 [00:00<00:00, 80.76it/s] 
Training: Epoch 004 | Batch 220 | Loss 2.19739: 100%|██████████| 229/229 [00:02<00:00, 105.00it/s]
Validating: Epoch 004 | Batch 050 | Loss 2.22579: 100%|██████████| 58/58 [00:00<00:00, 84.42it/s]
Training: Epoch 005 | Batch 220 | Loss 2.01790: 100%|██████████| 229/229 [00:02<00:00, 96.58it/s] 
Validating: Epoch 005 | Batch 050 | Loss 2.02310: 100%|██████████| 58/58 [00:00<00:00, 80.80it/s] 
Training: Epoch 006 | Batch 220 | Loss 1.94550: 100%|██████████| 229/229 [00:02<00:00, 103.49it/s]
Validating: Epoch 006 | Batch 050 | Loss 1.95427: 100%|██████████| 58/58 [00:00<00:00, 78.19it/s]
Training: Epoch 007 | Batch 220 | Loss 1.84876: 100%|██████████| 229/229 [00:02<00:00, 102.95it/s]
Validating: Epoch 007 | Batch 050 | Loss 1.82309: 100%|██████████| 58/58 [00:00<00:00, 81.39it/s] 
Training: Epoch 008 | Batch 220 | Loss 1.88964: 100%|██████████| 229/229 [00:02<00:00, 101.61it/s]
Validating: Epoch 008 | Batch 050 | Loss 1.79036: 100%|██████████| 58/58 [00:00<00:00, 80.08it/s] 
Training: Epoch 009 | Batch 220 | Loss 1.71107: 100%|██████████| 229/229 [00:02<00:00, 103.48it/s]
Validating: Epoch 009 | Batch 050 | Loss 1.71424: 100%|██████████| 58/58 [00:00<00:00, 82.92it/s] 
Training: Epoch 010 | Batch 220 | Loss 1.63846: 100%|██████████| 229/229 [00:02<00:00, 100.12it/s]
Validating: Epoch 010 | Batch 050 | Loss 1.65294: 100%|██████████| 58/58 [00:00<00:00, 81.06it/s] 
Training: Epoch 011 | Batch 220 | Loss 1.65542: 100%|██████████| 229/229 [00:02<00:00, 104.54it/s]
Validating: Epoch 011 | Batch 050 | Loss 1.62710: 100%|██████████| 58/58 [00:00<00:00, 80.24it/s] 
Training: Epoch 012 | Batch 220 | Loss 1.57630: 100%|██████████| 229/229 [00:02<00:00, 102.62it/s]
Validating: Epoch 012 | Batch 050 | Loss 1.64625: 100%|██████████| 58/58 [00:00<00:00, 81.44it/s] 
Training: Epoch 013 | Batch 220 | Loss 1.56920: 100%|██████████| 229/229 [00:02<00:00, 102.69it/s]
Validating: Epoch 013 | Batch 050 | Loss 1.59821: 100%|██████████| 58/58 [00:00<00:00, 75.04it/s] 
Training: Epoch 014 | Batch 220 | Loss 1.62764: 100%|██████████| 229/229 [00:02<00:00, 103.95it/s]
Validating: Epoch 014 | Batch 050 | Loss 1.56287: 100%|██████████| 58/58 [00:00<00:00, 76.03it/s]
Training: Epoch 015 | Batch 220 | Loss 1.57860: 100%|██████████| 229/229 [00:02<00:00, 102.09it/s]
Validating: Epoch 015 | Batch 050 | Loss 1.58906: 100%|██████████| 58/58 [00:00<00:00, 82.52it/s] 
Training: Epoch 016 | Batch 220 | Loss 1.53039: 100%|██████████| 229/229 [00:02<00:00, 102.18it/s]
Validating: Epoch 016 | Batch 050 | Loss 1.57327: 100%|██████████| 58/58 [00:00<00:00, 76.82it/s]
Training: Epoch 017 | Batch 220 | Loss 1.52663: 100%|██████████| 229/229 [00:02<00:00, 102.45it/s]
Validating: Epoch 017 | Batch 050 | Loss 1.52597: 100%|██████████| 58/58 [00:00<00:00, 77.42it/s] 
Training: Epoch 018 | Batch 220 | Loss 1.53395: 100%|██████████| 229/229 [00:02<00:00, 102.17it/s]
Validating: Epoch 018 | Batch 050 | Loss 1.51940: 100%|██████████| 58/58 [00:00<00:00, 83.69it/s] 
Training: Epoch 019 | Batch 220 | Loss 1.51497: 100%|██████████| 229/229 [00:02<00:00, 101.32it/s]
Validating: Epoch 019 | Batch 050 | Loss 1.52132: 100%|██████████| 58/58 [00:00<00:00, 83.96it/s] 
Training: Epoch 020 | Batch 220 | Loss 1.50475: 100%|██████████| 229/229 [00:02<00:00, 103.21it/s]
Validating: Epoch 020 | Batch 050 | Loss 1.53533: 100%|██████████| 58/58 [00:00<00:00, 77.73it/s] 
Training: Epoch 021 | Batch 220 | Loss 1.52062: 100%|██████████| 229/229 [00:02<00:00, 102.25it/s]
Validating: Epoch 021 | Batch 050 | Loss 1.50720: 100%|██████████| 58/58 [00:00<00:00, 82.99it/s] 
Training: Epoch 022 | Batch 220 | Loss 1.46932: 100%|██████████| 229/229 [00:02<00:00, 103.09it/s]
Validating: Epoch 022 | Batch 050 | Loss 1.47495: 100%|██████████| 58/58 [00:00<00:00, 82.12it/s] 
Training: Epoch 023 | Batch 220 | Loss 1.41911: 100%|██████████| 229/229 [00:02<00:00, 103.77it/s]
Validating: Epoch 023 | Batch 050 | Loss 1.46717: 100%|██████████| 58/58 [00:00<00:00, 77.95it/s]
Training: Epoch 024 | Batch 220 | Loss 1.45865: 100%|██████████| 229/229 [00:02<00:00, 101.95it/s]
Validating: Epoch 024 | Batch 050 | Loss 1.49449: 100%|██████████| 58/58 [00:00<00:00, 76.36it/s]
Training: Epoch 025 | Batch 220 | Loss 1.44133: 100%|██████████| 229/229 [00:02<00:00, 103.01it/s]
Validating: Epoch 025 | Batch 050 | Loss 1.42666: 100%|██████████| 58/58 [00:00<00:00, 81.96it/s] 
Training: Epoch 026 | Batch 220 | Loss 1.40135: 100%|██████████| 229/229 [00:02<00:00, 102.93it/s]
Validating: Epoch 026 | Batch 050 | Loss 1.50797: 100%|██████████| 58/58 [00:00<00:00, 75.80it/s]
Training: Epoch 027 | Batch 220 | Loss 1.49972: 100%|██████████| 229/229 [00:02<00:00, 102.82it/s]
Validating: Epoch 027 | Batch 050 | Loss 1.46080: 100%|██████████| 58/58 [00:00<00:00, 74.33it/s]
Training: Epoch 028 | Batch 220 | Loss 1.55770: 100%|██████████| 229/229 [00:02<00:00, 104.29it/s]
Validating: Epoch 028 | Batch 050 | Loss 1.49166: 100%|██████████| 58/58 [00:00<00:00, 85.21it/s] 
Training: Epoch 029 | Batch 220 | Loss 1.51058: 100%|██████████| 229/229 [00:02<00:00, 102.59it/s]
Validating: Epoch 029 | Batch 050 | Loss 1.43664: 100%|██████████| 58/58 [00:00<00:00, 82.14it/s] 
Training: Epoch 030 | Batch 220 | Loss 1.38682: 100%|██████████| 229/229 [00:02<00:00, 102.41it/s]
Validating: Epoch 030 | Batch 050 | Loss 1.44733: 100%|██████████| 58/58 [00:00<00:00, 79.34it/s] 
Training: Epoch 031 | Batch 220 | Loss 1.35611: 100%|██████████| 229/229 [00:02<00:00, 103.89it/s]
Validating: Epoch 031 | Batch 050 | Loss 1.46215: 100%|██████████| 58/58 [00:00<00:00, 83.23it/s] 
Training: Epoch 032 | Batch 220 | Loss 1.49130: 100%|██████████| 229/229 [00:02<00:00, 103.24it/s]
Validating: Epoch 032 | Batch 050 | Loss 1.41477: 100%|██████████| 58/58 [00:00<00:00, 84.56it/s] 
Training: Epoch 033 | Batch 220 | Loss 1.42362: 100%|██████████| 229/229 [00:02<00:00, 102.44it/s]
Validating: Epoch 033 | Batch 050 | Loss 1.47204: 100%|██████████| 58/58 [00:00<00:00, 78.82it/s] 
Training: Epoch 034 | Batch 220 | Loss 1.46213: 100%|██████████| 229/229 [00:02<00:00, 103.24it/s]
Validating: Epoch 034 | Batch 050 | Loss 1.43887: 100%|██████████| 58/58 [00:00<00:00, 73.71it/s]
Training: Epoch 035 | Batch 220 | Loss 1.42154: 100%|██████████| 229/229 [00:02<00:00, 103.37it/s]
Validating: Epoch 035 | Batch 050 | Loss 1.41960: 100%|██████████| 58/58 [00:00<00:00, 84.14it/s] 
Training: Epoch 036 | Batch 220 | Loss 1.40063: 100%|██████████| 229/229 [00:02<00:00, 103.46it/s]
Validating: Epoch 036 | Batch 050 | Loss 1.43267: 100%|██████████| 58/58 [00:00<00:00, 83.26it/s] 
Training: Epoch 037 | Batch 220 | Loss 1.31419: 100%|██████████| 229/229 [00:02<00:00, 102.91it/s]
Validating: Epoch 037 | Batch 050 | Loss 1.41832: 100%|██████████| 58/58 [00:00<00:00, 77.06it/s]
Training: Epoch 038 | Batch 220 | Loss 1.42920: 100%|██████████| 229/229 [00:02<00:00, 102.36it/s]
Validating: Epoch 038 | Batch 050 | Loss 1.39364: 100%|██████████| 58/58 [00:00<00:00, 85.01it/s] 
Training: Epoch 039 | Batch 220 | Loss 1.40096: 100%|██████████| 229/229 [00:02<00:00, 103.45it/s]
Validating: Epoch 039 | Batch 050 | Loss 1.46422: 100%|██████████| 58/58 [00:00<00:00, 83.29it/s] 
Training: Epoch 040 | Batch 220 | Loss 1.32629: 100%|██████████| 229/229 [00:02<00:00, 102.23it/s]
Validating: Epoch 040 | Batch 050 | Loss 1.42313: 100%|██████████| 58/58 [00:00<00:00, 80.36it/s] 
Training: Epoch 041 | Batch 220 | Loss 1.37687: 100%|██████████| 229/229 [00:02<00:00, 103.45it/s]
Validating: Epoch 041 | Batch 050 | Loss 1.46212: 100%|██████████| 58/58 [00:00<00:00, 82.41it/s] 
Training: Epoch 042 | Batch 220 | Loss 1.42229: 100%|██████████| 229/229 [00:02<00:00, 102.88it/s]
Validating: Epoch 042 | Batch 050 | Loss 1.40794: 100%|██████████| 58/58 [00:00<00:00, 79.11it/s] 
Training: Epoch 043 | Batch 220 | Loss 1.35377: 100%|██████████| 229/229 [00:02<00:00, 102.23it/s]
Validating: Epoch 043 | Batch 050 | Loss 1.40169: 100%|██████████| 58/58 [00:00<00:00, 77.26it/s]
Training: Epoch 044 | Batch 220 | Loss 1.45695: 100%|██████████| 229/229 [00:02<00:00, 102.53it/s]
Validating: Epoch 044 | Batch 050 | Loss 1.46606: 100%|██████████| 58/58 [00:00<00:00, 80.55it/s] 
Training: Epoch 045 | Batch 220 | Loss 1.44389: 100%|██████████| 229/229 [00:02<00:00, 102.40it/s]
Validating: Epoch 045 | Batch 050 | Loss 1.43905: 100%|██████████| 58/58 [00:00<00:00, 77.30it/s]
Training: Epoch 046 | Batch 220 | Loss 1.37299: 100%|██████████| 229/229 [00:02<00:00, 102.40it/s]
Validating: Epoch 046 | Batch 050 | Loss 1.42579: 100%|██████████| 58/58 [00:00<00:00, 78.56it/s] 
Training: Epoch 047 | Batch 220 | Loss 1.34783: 100%|██████████| 229/229 [00:02<00:00, 103.20it/s]
Validating: Epoch 047 | Batch 050 | Loss 1.39403: 100%|██████████| 58/58 [00:00<00:00, 78.03it/s] 
Training: Epoch 048 | Batch 220 | Loss 1.39765: 100%|██████████| 229/229 [00:02<00:00, 102.63it/s]
Validating: Epoch 048 | Batch 050 | Loss 1.39946: 100%|██████████| 58/58 [00:00<00:00, 86.00it/s] 
Training: Epoch 049 | Batch 220 | Loss 1.36720: 100%|██████████| 229/229 [00:02<00:00, 103.46it/s]
Validating: Epoch 049 | Batch 050 | Loss 1.38164: 100%|██████████| 58/58 [00:00<00:00, 74.97it/s]
Training: Epoch 050 | Batch 220 | Loss 1.40148: 100%|██████████| 229/229 [00:02<00:00, 99.27it/s] 
Validating: Epoch 050 | Batch 050 | Loss 1.43258: 100%|██████████| 58/58 [00:00<00:00, 80.86it/s] 
Training: Epoch 051 | Batch 220 | Loss 1.40761: 100%|██████████| 229/229 [00:02<00:00, 102.45it/s]
Validating: Epoch 051 | Batch 050 | Loss 1.41481: 100%|██████████| 58/58 [00:00<00:00, 84.43it/s] 
Training: Epoch 052 | Batch 220 | Loss 1.41523: 100%|██████████| 229/229 [00:02<00:00, 101.93it/s]
Validating: Epoch 052 | Batch 050 | Loss 1.43238: 100%|██████████| 58/58 [00:00<00:00, 83.95it/s] 
Training: Epoch 053 | Batch 220 | Loss 1.35329: 100%|██████████| 229/229 [00:02<00:00, 103.24it/s]
Validating: Epoch 053 | Batch 050 | Loss 1.38759: 100%|██████████| 58/58 [00:00<00:00, 84.91it/s] 
Training: Epoch 054 | Batch 220 | Loss 1.46726: 100%|██████████| 229/229 [00:02<00:00, 101.82it/s]
Validating: Epoch 054 | Batch 050 | Loss 1.42500: 100%|██████████| 58/58 [00:00<00:00, 80.60it/s] 
Training: Epoch 055 | Batch 220 | Loss 1.35786: 100%|██████████| 229/229 [00:02<00:00, 103.42it/s]
Validating: Epoch 055 | Batch 050 | Loss 1.40969: 100%|██████████| 58/58 [00:00<00:00, 81.34it/s] 
Training: Epoch 056 | Batch 220 | Loss 1.36340: 100%|██████████| 229/229 [00:02<00:00, 102.01it/s]
Validating: Epoch 056 | Batch 050 | Loss 1.41167: 100%|██████████| 58/58 [00:00<00:00, 81.88it/s] 
Training: Epoch 057 | Batch 220 | Loss 1.34330: 100%|██████████| 229/229 [00:02<00:00, 102.74it/s]
Validating: Epoch 057 | Batch 050 | Loss 1.39902: 100%|██████████| 58/58 [00:00<00:00, 82.71it/s] 
Training: Epoch 058 | Batch 220 | Loss 1.40148: 100%|██████████| 229/229 [00:02<00:00, 104.33it/s]
Validating: Epoch 058 | Batch 050 | Loss 1.39730: 100%|██████████| 58/58 [00:00<00:00, 83.31it/s] 
Training: Epoch 059 | Batch 220 | Loss 1.33614: 100%|██████████| 229/229 [00:02<00:00, 102.24it/s]
Validating: Epoch 059 | Batch 050 | Loss 1.37413: 100%|██████████| 58/58 [00:00<00:00, 80.80it/s] 
Training: Epoch 060 | Batch 220 | Loss 1.42272: 100%|██████████| 229/229 [00:02<00:00, 101.66it/s]
Validating: Epoch 060 | Batch 050 | Loss 1.36072: 100%|██████████| 58/58 [00:00<00:00, 81.73it/s] 
Training: Epoch 061 | Batch 220 | Loss 1.37704: 100%|██████████| 229/229 [00:02<00:00, 102.86it/s]
Validating: Epoch 061 | Batch 050 | Loss 1.41515: 100%|██████████| 58/58 [00:00<00:00, 81.64it/s] 
Training: Epoch 062 | Batch 220 | Loss 1.37023: 100%|██████████| 229/229 [00:02<00:00, 102.89it/s]
Validating: Epoch 062 | Batch 050 | Loss 1.41203: 100%|██████████| 58/58 [00:00<00:00, 81.50it/s] 
Training: Epoch 063 | Batch 220 | Loss 1.29426: 100%|██████████| 229/229 [00:02<00:00, 102.26it/s]
Validating: Epoch 063 | Batch 050 | Loss 1.34814: 100%|██████████| 58/58 [00:00<00:00, 81.07it/s] 
Training: Epoch 064 | Batch 220 | Loss 1.28137: 100%|██████████| 229/229 [00:02<00:00, 104.68it/s]
Validating: Epoch 064 | Batch 050 | Loss 1.38885: 100%|██████████| 58/58 [00:00<00:00, 80.61it/s] 
Training: Epoch 065 | Batch 220 | Loss 1.44414: 100%|██████████| 229/229 [00:02<00:00, 100.86it/s]
Validating: Epoch 065 | Batch 050 | Loss 1.37149: 100%|██████████| 58/58 [00:00<00:00, 82.27it/s] 
Training: Epoch 066 | Batch 220 | Loss 1.41152: 100%|██████████| 229/229 [00:02<00:00, 102.82it/s]
Validating: Epoch 066 | Batch 050 | Loss 1.42398: 100%|██████████| 58/58 [00:00<00:00, 79.93it/s] 
Training: Epoch 067 | Batch 220 | Loss 1.33857: 100%|██████████| 229/229 [00:02<00:00, 102.58it/s]
Validating: Epoch 067 | Batch 050 | Loss 1.36157: 100%|██████████| 58/58 [00:00<00:00, 80.68it/s] 
Training: Epoch 068 | Batch 220 | Loss 1.44748: 100%|██████████| 229/229 [00:02<00:00, 103.13it/s]
Validating: Epoch 068 | Batch 050 | Loss 1.42332: 100%|██████████| 58/58 [00:00<00:00, 85.38it/s] 
Training: Epoch 069 | Batch 220 | Loss 1.32457: 100%|██████████| 229/229 [00:02<00:00, 103.52it/s]
Validating: Epoch 069 | Batch 050 | Loss 1.36740: 100%|██████████| 58/58 [00:00<00:00, 77.06it/s]
Training: Epoch 070 | Batch 220 | Loss 1.34762: 100%|██████████| 229/229 [00:02<00:00, 103.88it/s]
Validating: Epoch 070 | Batch 050 | Loss 1.41334: 100%|██████████| 58/58 [00:00<00:00, 73.51it/s]
Training: Epoch 071 | Batch 220 | Loss 1.26996: 100%|██████████| 229/229 [00:02<00:00, 102.79it/s]
Validating: Epoch 071 | Batch 050 | Loss 1.36056: 100%|██████████| 58/58 [00:00<00:00, 78.93it/s] 
Training: Epoch 072 | Batch 220 | Loss 1.31478: 100%|██████████| 229/229 [00:02<00:00, 103.81it/s]
Validating: Epoch 072 | Batch 050 | Loss 1.35397: 100%|██████████| 58/58 [00:00<00:00, 82.21it/s] 
Training: Epoch 073 | Batch 220 | Loss 1.33596: 100%|██████████| 229/229 [00:02<00:00, 102.24it/s]
Validating: Epoch 073 | Batch 050 | Loss 1.36688: 100%|██████████| 58/58 [00:00<00:00, 79.78it/s] 
Training: Epoch 074 | Batch 220 | Loss 1.37466: 100%|██████████| 229/229 [00:02<00:00, 102.11it/s]
Validating: Epoch 074 | Batch 050 | Loss 1.38558: 100%|██████████| 58/58 [00:00<00:00, 79.36it/s] 
Training: Epoch 075 | Batch 220 | Loss 1.34516: 100%|██████████| 229/229 [00:02<00:00, 102.37it/s]
Validating: Epoch 075 | Batch 050 | Loss 1.34851: 100%|██████████| 58/58 [00:00<00:00, 85.86it/s] 
Training: Epoch 076 | Batch 220 | Loss 1.33146: 100%|██████████| 229/229 [00:02<00:00, 103.40it/s]
Validating: Epoch 076 | Batch 050 | Loss 1.34291: 100%|██████████| 58/58 [00:00<00:00, 84.99it/s] 
Training: Epoch 077 | Batch 220 | Loss 1.25763: 100%|██████████| 229/229 [00:02<00:00, 103.11it/s]
Validating: Epoch 077 | Batch 050 | Loss 1.34041: 100%|██████████| 58/58 [00:00<00:00, 77.95it/s] 
Training: Epoch 078 | Batch 220 | Loss 1.40204: 100%|██████████| 229/229 [00:02<00:00, 103.00it/s]
Validating: Epoch 078 | Batch 050 | Loss 1.36802: 100%|██████████| 58/58 [00:00<00:00, 79.25it/s] 
Training: Epoch 079 | Batch 220 | Loss 1.37943: 100%|██████████| 229/229 [00:02<00:00, 102.54it/s]
Validating: Epoch 079 | Batch 050 | Loss 1.41797: 100%|██████████| 58/58 [00:00<00:00, 84.24it/s] 
Training: Epoch 080 | Batch 220 | Loss 1.31690: 100%|██████████| 229/229 [00:02<00:00, 102.61it/s]
Validating: Epoch 080 | Batch 050 | Loss 1.35109: 100%|██████████| 58/58 [00:00<00:00, 80.63it/s] 
Training: Epoch 081 | Batch 220 | Loss 1.29325: 100%|██████████| 229/229 [00:02<00:00, 104.06it/s]
Validating: Epoch 081 | Batch 050 | Loss 1.35389: 100%|██████████| 58/58 [00:00<00:00, 84.11it/s] 
Training: Epoch 082 | Batch 220 | Loss 1.32249: 100%|██████████| 229/229 [00:02<00:00, 103.34it/s]
Validating: Epoch 082 | Batch 050 | Loss 1.37012: 100%|██████████| 58/58 [00:00<00:00, 84.12it/s] 
Training: Epoch 083 | Batch 220 | Loss 1.32176: 100%|██████████| 229/229 [00:02<00:00, 101.42it/s]
Validating: Epoch 083 | Batch 050 | Loss 1.35477: 100%|██████████| 58/58 [00:00<00:00, 81.87it/s] 
Training: Epoch 084 | Batch 220 | Loss 1.37431: 100%|██████████| 229/229 [00:02<00:00, 101.35it/s]
Validating: Epoch 084 | Batch 050 | Loss 1.42024: 100%|██████████| 58/58 [00:00<00:00, 86.57it/s] 
Training: Epoch 085 | Batch 220 | Loss 1.37325: 100%|██████████| 229/229 [00:02<00:00, 104.48it/s]
Validating: Epoch 085 | Batch 050 | Loss 1.36945: 100%|██████████| 58/58 [00:00<00:00, 82.21it/s] 
Training: Epoch 086 | Batch 220 | Loss 1.33600: 100%|██████████| 229/229 [00:02<00:00, 104.16it/s]
Validating: Epoch 086 | Batch 050 | Loss 1.33236: 100%|██████████| 58/58 [00:00<00:00, 86.37it/s] 
Training: Epoch 087 | Batch 220 | Loss 1.28421: 100%|██████████| 229/229 [00:02<00:00, 102.03it/s]
Validating: Epoch 087 | Batch 050 | Loss 1.38404: 100%|██████████| 58/58 [00:00<00:00, 79.33it/s] 
Training: Epoch 088 | Batch 220 | Loss 1.28477: 100%|██████████| 229/229 [00:02<00:00, 101.00it/s]
Validating: Epoch 088 | Batch 050 | Loss 1.34823: 100%|██████████| 58/58 [00:00<00:00, 83.11it/s] 
Training: Epoch 089 | Batch 220 | Loss 1.29153: 100%|██████████| 229/229 [00:02<00:00, 103.26it/s]
Validating: Epoch 089 | Batch 050 | Loss 1.33446: 100%|██████████| 58/58 [00:00<00:00, 81.75it/s] 
Training: Epoch 090 | Batch 220 | Loss 1.30460: 100%|██████████| 229/229 [00:02<00:00, 103.28it/s]
Validating: Epoch 090 | Batch 050 | Loss 1.35934: 100%|██████████| 58/58 [00:00<00:00, 78.22it/s] 
Training: Epoch 091 | Batch 220 | Loss 1.28667: 100%|██████████| 229/229 [00:02<00:00, 102.03it/s]
Validating: Epoch 091 | Batch 050 | Loss 1.36696: 100%|██████████| 58/58 [00:00<00:00, 80.53it/s] 
Training: Epoch 092 | Batch 220 | Loss 1.25019: 100%|██████████| 229/229 [00:02<00:00, 104.34it/s]
Validating: Epoch 092 | Batch 050 | Loss 1.31147: 100%|██████████| 58/58 [00:00<00:00, 76.10it/s]
Training: Epoch 093 | Batch 220 | Loss 1.36544: 100%|██████████| 229/229 [00:02<00:00, 103.16it/s]
Validating: Epoch 093 | Batch 050 | Loss 1.34564: 100%|██████████| 58/58 [00:00<00:00, 78.25it/s]
Training: Epoch 094 | Batch 220 | Loss 1.28783: 100%|██████████| 229/229 [00:02<00:00, 101.72it/s]
Validating: Epoch 094 | Batch 050 | Loss 1.34094: 100%|██████████| 58/58 [00:00<00:00, 86.11it/s] 
Training: Epoch 095 | Batch 220 | Loss 1.36414: 100%|██████████| 229/229 [00:02<00:00, 102.99it/s]
Validating: Epoch 095 | Batch 050 | Loss 1.37341: 100%|██████████| 58/58 [00:00<00:00, 85.76it/s] 
Training: Epoch 096 | Batch 220 | Loss 1.29023: 100%|██████████| 229/229 [00:02<00:00, 102.73it/s]
Validating: Epoch 096 | Batch 050 | Loss 1.40810: 100%|██████████| 58/58 [00:00<00:00, 78.27it/s]
Training: Epoch 097 | Batch 220 | Loss 1.36055: 100%|██████████| 229/229 [00:02<00:00, 103.92it/s]
Validating: Epoch 097 | Batch 050 | Loss 1.39314: 100%|██████████| 58/58 [00:00<00:00, 78.68it/s]
Training: Epoch 098 | Batch 220 | Loss 1.39010: 100%|██████████| 229/229 [00:02<00:00, 101.86it/s]
Validating: Epoch 098 | Batch 050 | Loss 1.36179: 100%|██████████| 58/58 [00:00<00:00, 83.41it/s] 
Training: Epoch 099 | Batch 220 | Loss 1.26612: 100%|██████████| 229/229 [00:02<00:00, 101.55it/s]
Validating: Epoch 099 | Batch 050 | Loss 1.34806: 100%|██████████| 58/58 [00:00<00:00, 76.28it/s]
Training: Epoch 100 | Batch 220 | Loss 1.37650: 100%|██████████| 229/229 [00:02<00:00, 103.06it/s]
Validating: Epoch 100 | Batch 050 | Loss 1.33071: 100%|██████████| 58/58 [00:00<00:00, 77.72it/s] 
Looping finished

In [ ]:
plot.plot_loss(trainer_cnn_l1.log)
plot.plot_metrics(trainer_cnn_l1.log)
No description has been provided for this image
No description has been provided for this image
In [ ]:
_, model_cnn_l1, _, _ = utils_checkpoints.load(path_dir_exp_cnn_l1 / "checkpoints" / "final.pth")

evaluator_cnn_l1 = Evaluator("svhn_cnn_l1", model_cnn_l1)
evaluator_cnn_l1.evaluate()

print(f"Loss on test data: {evaluator_cnn_l1.log["total"]["loss"]}")
print(f"Metrics on test data")
for name, metrics in evaluator_cnn_l1.log["total"]["metrics"].items():
    print(f"    {name:<10}: {metrics}")
Setting up dataloader...
Test dataset
Dataset SVHN
    Number of datapoints: 26032
    Root location: /home/user/karacora/lab-vision-systems-assignments/assignment_2/data/svhn
    Split: test
    Transform: Compose(
    ToTensor()
)
    Transform of target: Compose(
)
Setting up dataloader finished
Setting up criterion...
Setting up criterion finished
Setting up measurers...
Setting up measurers finished
Validating: Batch 100 | Loss 1.30799: 100%|██████████| 102/102 [00:01<00:00, 89.07it/s] 
Loss on test data: 1.31426691312913
Metrics on test data
    Accuracy  : 0.8824523663183774

Discussion¶

The $\mathcal{L}_2$ regularization is easily implemented (at least after I put so much work in before). In the corresponding config, a weight decay argument is added to the optimizer which implements the regularization. One could have easily implemented the $\mathcal{L}_2$ just like the $\mathcal{L}_1$ norm, but this is easier. The $\mathcal{L}_1$ norm is hard-coded into the trainer.py.

When using the $\mathcal{L}_2$ norm one can see in the logarithmic loss plot that the overfitting is reduced. Without regularization the loss on the training data is still exponentially descreasing while the validation loss stagnates. This is handled better using the regularization. The final accuracy on the test dataset is also increased ($0.918$ vs. $0.913$). The smaller network weight force the model to generalize better.

When using the $\mathcal{L}_1$ the overfitting effect between the train and validation datasets completely vanishes. However, the loss and the accuracy do not progress as far as for the other two cases. The sparsity of the weights does have a negative effect on the performance. I also tried a few other values for weighting the regularization term, but I did not achieve both regularization and improvement of performance.

Compare the results: training and validation loss, accuracy, ...

Evaluation of custom learning rate warmup and learning rate scheduler¶

In [ ]:
class CustomSchedulerWithWarmup:
    """Not even integrated into my Python package since I will probably never use it again.
    It seems to me that Pytorch is already super flexible when using combinations of its schedulers."""

    def __init__(self, optimizer, steps_warmup=10, slope_warmup=0.1, factor_reduction=0.7, patience=7):
        self.counter_step = None
        self.counter_wait = None
        self.lr_current = None
        self.lr_initial = None
        self.log = None
        self.factor_reduction = factor_reduction
        self.loss_best = None
        self.optimizer = optimizer
        self.patience = patience
        self.slope_warmup = slope_warmup
        self.steps_warmup = steps_warmup

        self._initialize()

    def _initialize(self):
        self.counter_step = 0
        self.counter_wait = 0
        self.log = []
        self.lr_initial = self.optimizer.param_groups[0]["lr"]
        self.lr_current = self.lr_initial

    def update_lr(self, step, loss):
        if step <= self.steps_warmup:
            self.lr_current = self.lr_initial * (1 + self.slope_warmup * (step - self.steps_warmup) + 0.2 * np.cos(step * self.steps_warmup * 2 * np.pi))
        else:
            if self.loss_best is None or loss < self.loss_best:
                self.loss_best = loss
                self.counter_wait = 0
            else:
                self.counter_wait += 1

        if self.counter_wait >= self.patience:
            self.lr_current *= self.factor_reduction
            self.counter_wait = 0

    def step(self, loss):
        self.counter_step += 1
        self.update_lr(self.counter_step, loss)
        self.optimizer.param_groups[0]["lr"] = self.lr_current
        self.log += [self.lr_current]

    def state_dict(self):
        # This line I stole from pytorch.
        return {key: value for key, value in self.__dict__.items() if key != "optimizer"}

Training¶

In [ ]:
path_dir_exp_cnn_scheduling = Path(config._PATH_DIR_EXPS) / "svhn_cnn_scheduling"

init_exp.init_exp(name_exp="svhn_cnn_scheduling", name_config="svhn_cnn")
config.set_config_exp(path_dir_exp_cnn_scheduling)
Initializing experiment svhn_cnn_scheduling...
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn_scheduling
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn_scheduling/checkpoints
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn_scheduling/logs
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn_scheduling/plots
Created directory /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn_scheduling/visualizations
Config loaded from /home/user/karacora/lab-vision-systems-assignments/assignment_2/assignment/configs/svhn_cnn.yaml
Config saved to /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn_scheduling/config.yaml
Initializing experiment svhn_cnn_scheduling finished
Config loaded from /home/user/karacora/lab-vision-systems-assignments/assignment_2/experiments/svhn_cnn_scheduling/config.yaml
In [ ]:
trainer_cnn_scheduling = Trainer("svhn_cnn_scheduling")
trainer_cnn_scheduling.scheduler = CustomSchedulerWithWarmup(trainer_cnn_scheduling.optimizer)
trainer_cnn_scheduling.loop(config.TRAINING["num_epochs"])
Setting up dataloaders...
Train dataset
Dataset SVHN
    Number of datapoints: 58605
    Root location: /home/user/karacora/lab-vision-systems-assignments/assignment_2/data/svhn
    Split: train
    Transform: Compose(
    ToTensor()
)
    Transform of target: Compose(
)
Validate dataset
Dataset SVHN
    Number of datapoints: 14652
    Root location: /home/user/karacora/lab-vision-systems-assignments/assignment_2/data/svhn
    Split: validate
    Transform: Compose(
    ToTensor()
)
    Transform of target: Compose(
)
Setting up dataloaders finished
Setting up model...
Model
CNN2d(
  (body): Sequential(
    (0): BlockCNN2d(
      (0): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1), padding=same, padding_mode=reflect)
      (1): InstanceNorm2d(32, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      (2): ReLU()
      (3): MaxPool2d(kernel_size=[2, 2], stride=[2, 2], padding=0, dilation=1, ceil_mode=False)
    )
    (1): BlockCNN2d(
      (0): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1), padding=same, padding_mode=reflect)
      (1): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      (2): ReLU()
      (3): MaxPool2d(kernel_size=[2, 2], stride=[2, 2], padding=0, dilation=1, ceil_mode=False)
    )
    (2): BlockCNN2d(
      (0): Conv2d(64, 128, kernel_size=(5, 5), stride=(1, 1), padding=same, padding_mode=reflect)
      (1): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      (2): ReLU()
      (3): MaxPool2d(kernel_size=[2, 2], stride=[2, 2], padding=0, dilation=1, ceil_mode=False)
    )
  )
  (head): MLP(
    (head): Sequential(
      (0): Flatten(start_dim=1, end_dim=-1)
      (1): Linear(in_features=2048, out_features=10, bias=True)
    )
  )
)
Setting up model finished
Setting up optimizer...
Setting up optimizer finished
Setting up criterion...
Setting up criterion finished
Setting up measurers...
Setting up measurers finished
Looping...
Validating: Epoch 000 | Batch 050 | Loss 2.40008: 100%|██████████| 58/58 [00:00<00:00, 96.36it/s] 
Training: Epoch 001 | Batch 220 | Loss 0.61319: 100%|██████████| 229/229 [00:02<00:00, 108.97it/s]
Validating: Epoch 001 | Batch 050 | Loss 0.57945: 100%|██████████| 58/58 [00:00<00:00, 87.74it/s] 
Training: Epoch 002 | Batch 220 | Loss 0.49074: 100%|██████████| 229/229 [00:02<00:00, 111.16it/s]
Validating: Epoch 002 | Batch 050 | Loss 0.49101: 100%|██████████| 58/58 [00:00<00:00, 98.64it/s] 
Training: Epoch 003 | Batch 220 | Loss 0.45451: 100%|██████████| 229/229 [00:02<00:00, 111.04it/s]
Validating: Epoch 003 | Batch 050 | Loss 0.44596: 100%|██████████| 58/58 [00:00<00:00, 68.86it/s]
Training: Epoch 004 | Batch 220 | Loss 0.36416: 100%|██████████| 229/229 [00:02<00:00, 108.68it/s]
Validating: Epoch 004 | Batch 050 | Loss 0.39381: 100%|██████████| 58/58 [00:00<00:00, 85.07it/s] 
Training: Epoch 005 | Batch 220 | Loss 0.29543: 100%|██████████| 229/229 [00:02<00:00, 108.73it/s]
Validating: Epoch 005 | Batch 050 | Loss 0.36373: 100%|██████████| 58/58 [00:00<00:00, 93.58it/s] 
Training: Epoch 006 | Batch 220 | Loss 0.35149: 100%|██████████| 229/229 [00:02<00:00, 109.76it/s]
Validating: Epoch 006 | Batch 050 | Loss 0.33600: 100%|██████████| 58/58 [00:00<00:00, 84.66it/s]
Training: Epoch 007 | Batch 220 | Loss 0.36603: 100%|██████████| 229/229 [00:02<00:00, 108.43it/s]
Validating: Epoch 007 | Batch 050 | Loss 0.29818: 100%|██████████| 58/58 [00:00<00:00, 91.17it/s] 
Training: Epoch 008 | Batch 220 | Loss 0.29147: 100%|██████████| 229/229 [00:02<00:00, 109.59it/s]
Validating: Epoch 008 | Batch 050 | Loss 0.28720: 100%|██████████| 58/58 [00:00<00:00, 88.67it/s]
Training: Epoch 009 | Batch 220 | Loss 0.28753: 100%|██████████| 229/229 [00:02<00:00, 109.75it/s]
Validating: Epoch 009 | Batch 050 | Loss 0.25453: 100%|██████████| 58/58 [00:00<00:00, 100.14it/s]
Training: Epoch 010 | Batch 220 | Loss 0.20174: 100%|██████████| 229/229 [00:02<00:00, 107.38it/s]
Validating: Epoch 010 | Batch 050 | Loss 0.23657: 100%|██████████| 58/58 [00:00<00:00, 87.52it/s]
Training: Epoch 011 | Batch 220 | Loss 0.19309: 100%|██████████| 229/229 [00:02<00:00, 108.61it/s]
Validating: Epoch 011 | Batch 050 | Loss 0.21063: 100%|██████████| 58/58 [00:00<00:00, 87.34it/s]
Training: Epoch 012 | Batch 220 | Loss 0.16411: 100%|██████████| 229/229 [00:02<00:00, 109.40it/s]
Validating: Epoch 012 | Batch 050 | Loss 0.19093: 100%|██████████| 58/58 [00:00<00:00, 91.66it/s] 
Training: Epoch 013 | Batch 220 | Loss 0.14515: 100%|██████████| 229/229 [00:02<00:00, 107.52it/s]
Validating: Epoch 013 | Batch 050 | Loss 0.17862: 100%|██████████| 58/58 [00:00<00:00, 87.53it/s]
Training: Epoch 014 | Batch 220 | Loss 0.16386: 100%|██████████| 229/229 [00:02<00:00, 107.70it/s]
Validating: Epoch 014 | Batch 050 | Loss 0.16864: 100%|██████████| 58/58 [00:00<00:00, 95.60it/s] 
Training: Epoch 015 | Batch 220 | Loss 0.09006: 100%|██████████| 229/229 [00:02<00:00, 105.58it/s]
Validating: Epoch 015 | Batch 050 | Loss 0.15160: 100%|██████████| 58/58 [00:00<00:00, 95.44it/s] 
Training: Epoch 016 | Batch 220 | Loss 0.07548: 100%|██████████| 229/229 [00:02<00:00, 103.49it/s]
Validating: Epoch 016 | Batch 050 | Loss 0.13373: 100%|██████████| 58/58 [00:00<00:00, 96.15it/s] 
Training: Epoch 017 | Batch 220 | Loss 0.13506: 100%|██████████| 229/229 [00:02<00:00, 109.07it/s]
Validating: Epoch 017 | Batch 050 | Loss 0.11681: 100%|██████████| 58/58 [00:00<00:00, 93.66it/s] 
Training: Epoch 018 | Batch 220 | Loss 0.09107: 100%|██████████| 229/229 [00:02<00:00, 106.21it/s]
Validating: Epoch 018 | Batch 050 | Loss 0.11567: 100%|██████████| 58/58 [00:00<00:00, 99.35it/s] 
Training: Epoch 019 | Batch 220 | Loss 0.06846: 100%|██████████| 229/229 [00:02<00:00, 108.03it/s]
Validating: Epoch 019 | Batch 050 | Loss 0.10840: 100%|██████████| 58/58 [00:00<00:00, 86.28it/s]
Training: Epoch 020 | Batch 220 | Loss 0.12095: 100%|██████████| 229/229 [00:02<00:00, 109.19it/s]
Validating: Epoch 020 | Batch 050 | Loss 0.09499: 100%|██████████| 58/58 [00:00<00:00, 92.09it/s] 
Training: Epoch 021 | Batch 220 | Loss 0.06605: 100%|██████████| 229/229 [00:02<00:00, 108.83it/s]
Validating: Epoch 021 | Batch 050 | Loss 0.08734: 100%|██████████| 58/58 [00:00<00:00, 90.51it/s] 
Training: Epoch 022 | Batch 220 | Loss 0.06213: 100%|██████████| 229/229 [00:02<00:00, 108.38it/s]
Validating: Epoch 022 | Batch 050 | Loss 0.07899: 100%|██████████| 58/58 [00:00<00:00, 89.89it/s] 
Training: Epoch 023 | Batch 220 | Loss 0.07434: 100%|██████████| 229/229 [00:02<00:00, 108.42it/s]
Validating: Epoch 023 | Batch 050 | Loss 0.07262: 100%|██████████| 58/58 [00:00<00:00, 94.59it/s] 
Training: Epoch 024 | Batch 220 | Loss 0.05275: 100%|██████████| 229/229 [00:02<00:00, 110.00it/s]
Validating: Epoch 024 | Batch 050 | Loss 0.07152: 100%|██████████| 58/58 [00:00<00:00, 92.49it/s] 
Training: Epoch 025 | Batch 220 | Loss 0.02584: 100%|██████████| 229/229 [00:02<00:00, 108.59it/s]
Validating: Epoch 025 | Batch 050 | Loss 0.05866: 100%|██████████| 58/58 [00:00<00:00, 96.12it/s] 
Training: Epoch 026 | Batch 220 | Loss 0.03153: 100%|██████████| 229/229 [00:02<00:00, 109.60it/s]
Validating: Epoch 026 | Batch 050 | Loss 0.06300: 100%|██████████| 58/58 [00:00<00:00, 96.78it/s] 
Training: Epoch 027 | Batch 220 | Loss 0.02516: 100%|██████████| 229/229 [00:02<00:00, 108.03it/s]
Validating: Epoch 027 | Batch 050 | Loss 0.06382: 100%|██████████| 58/58 [00:00<00:00, 91.00it/s] 
Training: Epoch 028 | Batch 220 | Loss 0.02111: 100%|██████████| 229/229 [00:02<00:00, 108.80it/s]
Validating: Epoch 028 | Batch 050 | Loss 0.05341: 100%|██████████| 58/58 [00:00<00:00, 92.55it/s] 
Training: Epoch 029 | Batch 220 | Loss 0.02003: 100%|██████████| 229/229 [00:02<00:00, 110.13it/s]
Validating: Epoch 029 | Batch 050 | Loss 0.05938: 100%|██████████| 58/58 [00:00<00:00, 90.27it/s] 
Training: Epoch 030 | Batch 220 | Loss 0.01948: 100%|██████████| 229/229 [00:02<00:00, 110.43it/s]
Validating: Epoch 030 | Batch 050 | Loss 0.05560: 100%|██████████| 58/58 [00:00<00:00, 89.61it/s] 
Training: Epoch 031 | Batch 220 | Loss 0.01995: 100%|██████████| 229/229 [00:02<00:00, 110.31it/s]
Validating: Epoch 031 | Batch 050 | Loss 0.06163: 100%|██████████| 58/58 [00:00<00:00, 95.82it/s] 
Training: Epoch 032 | Batch 220 | Loss 0.00952: 100%|██████████| 229/229 [00:02<00:00, 110.07it/s]
Validating: Epoch 032 | Batch 050 | Loss 0.05159: 100%|██████████| 58/58 [00:00<00:00, 94.29it/s] 
Training: Epoch 033 | Batch 220 | Loss 0.00828: 100%|██████████| 229/229 [00:02<00:00, 106.62it/s]
Validating: Epoch 033 | Batch 050 | Loss 0.05969: 100%|██████████| 58/58 [00:00<00:00, 90.31it/s]
Training: Epoch 034 | Batch 220 | Loss 0.01128: 100%|██████████| 229/229 [00:02<00:00, 107.60it/s]
Validating: Epoch 034 | Batch 050 | Loss 0.04494: 100%|██████████| 58/58 [00:00<00:00, 95.27it/s] 
Training: Epoch 035 | Batch 220 | Loss 0.00842: 100%|██████████| 229/229 [00:02<00:00, 109.32it/s]
Validating: Epoch 035 | Batch 050 | Loss 0.04919: 100%|██████████| 58/58 [00:00<00:00, 87.63it/s]
Training: Epoch 036 | Batch 220 | Loss 0.00863: 100%|██████████| 229/229 [00:02<00:00, 111.11it/s]
Validating: Epoch 036 | Batch 050 | Loss 0.05314: 100%|██████████| 58/58 [00:00<00:00, 87.59it/s]
Training: Epoch 037 | Batch 220 | Loss 0.01062: 100%|██████████| 229/229 [00:02<00:00, 108.61it/s]
Validating: Epoch 037 | Batch 050 | Loss 0.05759: 100%|██████████| 58/58 [00:00<00:00, 94.34it/s] 
Training: Epoch 038 | Batch 220 | Loss 0.00610: 100%|██████████| 229/229 [00:02<00:00, 106.76it/s]
Validating: Epoch 038 | Batch 050 | Loss 0.05115: 100%|██████████| 58/58 [00:00<00:00, 98.87it/s] 
Training: Epoch 039 | Batch 220 | Loss 0.05201: 100%|██████████| 229/229 [00:02<00:00, 107.58it/s]
Validating: Epoch 039 | Batch 050 | Loss 0.06266: 100%|██████████| 58/58 [00:00<00:00, 91.91it/s] 
Training: Epoch 040 | Batch 220 | Loss 0.00876: 100%|██████████| 229/229 [00:02<00:00, 108.81it/s]
Validating: Epoch 040 | Batch 050 | Loss 0.03995: 100%|██████████| 58/58 [00:00<00:00, 94.90it/s] 
Training: Epoch 041 | Batch 220 | Loss 0.00368: 100%|██████████| 229/229 [00:02<00:00, 109.76it/s]
Validating: Epoch 041 | Batch 050 | Loss 0.04703: 100%|██████████| 58/58 [00:00<00:00, 91.59it/s] 
Training: Epoch 042 | Batch 220 | Loss 0.00314: 100%|██████████| 229/229 [00:02<00:00, 106.25it/s]
Validating: Epoch 042 | Batch 050 | Loss 0.04848: 100%|██████████| 58/58 [00:00<00:00, 92.86it/s] 
Training: Epoch 043 | Batch 220 | Loss 0.00345: 100%|██████████| 229/229 [00:02<00:00, 108.50it/s]
Validating: Epoch 043 | Batch 050 | Loss 0.05043: 100%|██████████| 58/58 [00:00<00:00, 94.66it/s] 
Training: Epoch 044 | Batch 220 | Loss 0.00230: 100%|██████████| 229/229 [00:02<00:00, 108.71it/s]
Validating: Epoch 044 | Batch 050 | Loss 0.05125: 100%|██████████| 58/58 [00:00<00:00, 86.56it/s]
Training: Epoch 045 | Batch 220 | Loss 0.00303: 100%|██████████| 229/229 [00:02<00:00, 108.53it/s]
Validating: Epoch 045 | Batch 050 | Loss 0.05069: 100%|██████████| 58/58 [00:00<00:00, 89.21it/s] 
Training: Epoch 046 | Batch 220 | Loss 0.00295: 100%|██████████| 229/229 [00:02<00:00, 108.38it/s]
Validating: Epoch 046 | Batch 050 | Loss 0.05173: 100%|██████████| 58/58 [00:00<00:00, 94.04it/s] 
Training: Epoch 047 | Batch 220 | Loss 0.00383: 100%|██████████| 229/229 [00:02<00:00, 108.87it/s]
Validating: Epoch 047 | Batch 050 | Loss 0.04341: 100%|██████████| 58/58 [00:00<00:00, 87.86it/s]
Training: Epoch 048 | Batch 220 | Loss 0.00280: 100%|██████████| 229/229 [00:02<00:00, 106.95it/s]
Validating: Epoch 048 | Batch 050 | Loss 0.05158: 100%|██████████| 58/58 [00:00<00:00, 92.08it/s] 
Training: Epoch 049 | Batch 220 | Loss 0.00145: 100%|██████████| 229/229 [00:02<00:00, 107.63it/s]
Validating: Epoch 049 | Batch 050 | Loss 0.05330: 100%|██████████| 58/58 [00:00<00:00, 100.74it/s]
Training: Epoch 050 | Batch 220 | Loss 0.00181: 100%|██████████| 229/229 [00:02<00:00, 109.84it/s]
Validating: Epoch 050 | Batch 050 | Loss 0.05318: 100%|██████████| 58/58 [00:00<00:00, 96.26it/s] 
Training: Epoch 051 | Batch 220 | Loss 0.00149: 100%|██████████| 229/229 [00:02<00:00, 108.10it/s]
Validating: Epoch 051 | Batch 050 | Loss 0.05521: 100%|██████████| 58/58 [00:00<00:00, 95.63it/s] 
Training: Epoch 052 | Batch 220 | Loss 0.00150: 100%|██████████| 229/229 [00:02<00:00, 109.74it/s]
Validating: Epoch 052 | Batch 050 | Loss 0.05355: 100%|██████████| 58/58 [00:00<00:00, 99.11it/s] 
Training: Epoch 053 | Batch 220 | Loss 0.00223: 100%|██████████| 229/229 [00:02<00:00, 109.34it/s]
Validating: Epoch 053 | Batch 050 | Loss 0.06509: 100%|██████████| 58/58 [00:00<00:00, 98.82it/s] 
Training: Epoch 054 | Batch 220 | Loss 0.00176: 100%|██████████| 229/229 [00:02<00:00, 107.86it/s]
Validating: Epoch 054 | Batch 050 | Loss 0.06260: 100%|██████████| 58/58 [00:00<00:00, 88.15it/s] 
Training: Epoch 055 | Batch 220 | Loss 0.00289: 100%|██████████| 229/229 [00:02<00:00, 109.62it/s]
Validating: Epoch 055 | Batch 050 | Loss 0.05712: 100%|██████████| 58/58 [00:00<00:00, 97.80it/s] 
Training: Epoch 056 | Batch 220 | Loss 0.00142: 100%|██████████| 229/229 [00:02<00:00, 109.74it/s]
Validating: Epoch 056 | Batch 050 | Loss 0.05791: 100%|██████████| 58/58 [00:00<00:00, 83.37it/s]
Training: Epoch 057 | Batch 220 | Loss 0.00126: 100%|██████████| 229/229 [00:02<00:00, 110.32it/s]
Validating: Epoch 057 | Batch 050 | Loss 0.05436: 100%|██████████| 58/58 [00:00<00:00, 95.22it/s] 
Training: Epoch 058 | Batch 220 | Loss 0.00148: 100%|██████████| 229/229 [00:02<00:00, 108.14it/s]
Validating: Epoch 058 | Batch 050 | Loss 0.05502: 100%|██████████| 58/58 [00:00<00:00, 91.56it/s] 
Training: Epoch 059 | Batch 220 | Loss 0.00120: 100%|██████████| 229/229 [00:02<00:00, 108.36it/s]
Validating: Epoch 059 | Batch 050 | Loss 0.05656: 100%|██████████| 58/58 [00:00<00:00, 91.76it/s] 
Training: Epoch 060 | Batch 220 | Loss 0.00119: 100%|██████████| 229/229 [00:02<00:00, 109.05it/s]
Validating: Epoch 060 | Batch 050 | Loss 0.05563: 100%|██████████| 58/58 [00:00<00:00, 98.59it/s] 
Training: Epoch 061 | Batch 220 | Loss 0.00088: 100%|██████████| 229/229 [00:02<00:00, 110.58it/s]
Validating: Epoch 061 | Batch 050 | Loss 0.05399: 100%|██████████| 58/58 [00:00<00:00, 85.35it/s]
Training: Epoch 062 | Batch 220 | Loss 0.04155: 100%|██████████| 229/229 [00:02<00:00, 109.84it/s]
Validating: Epoch 062 | Batch 050 | Loss 0.08105: 100%|██████████| 58/58 [00:00<00:00, 92.96it/s] 
Training: Epoch 063 | Batch 220 | Loss 0.00139: 100%|██████████| 229/229 [00:02<00:00, 107.31it/s]
Validating: Epoch 063 | Batch 050 | Loss 0.05785: 100%|██████████| 58/58 [00:00<00:00, 90.26it/s] 
Training: Epoch 064 | Batch 220 | Loss 0.00106: 100%|██████████| 229/229 [00:02<00:00, 108.35it/s]
Validating: Epoch 064 | Batch 050 | Loss 0.05983: 100%|██████████| 58/58 [00:00<00:00, 95.38it/s] 
Training: Epoch 065 | Batch 220 | Loss 0.00084: 100%|██████████| 229/229 [00:02<00:00, 108.09it/s]
Validating: Epoch 065 | Batch 050 | Loss 0.05406: 100%|██████████| 58/58 [00:00<00:00, 102.68it/s]
Training: Epoch 066 | Batch 220 | Loss 0.00096: 100%|██████████| 229/229 [00:02<00:00, 109.33it/s]
Validating: Epoch 066 | Batch 050 | Loss 0.05837: 100%|██████████| 58/58 [00:00<00:00, 94.57it/s] 
Training: Epoch 067 | Batch 220 | Loss 0.00068: 100%|██████████| 229/229 [00:02<00:00, 108.13it/s]
Validating: Epoch 067 | Batch 050 | Loss 0.06202: 100%|██████████| 58/58 [00:00<00:00, 90.16it/s] 
Training: Epoch 068 | Batch 220 | Loss 0.00077: 100%|██████████| 229/229 [00:02<00:00, 108.75it/s]
Validating: Epoch 068 | Batch 050 | Loss 0.05763: 100%|██████████| 58/58 [00:00<00:00, 94.61it/s] 
Training: Epoch 069 | Batch 220 | Loss 0.00057: 100%|██████████| 229/229 [00:02<00:00, 108.67it/s]
Validating: Epoch 069 | Batch 050 | Loss 0.05804: 100%|██████████| 58/58 [00:00<00:00, 94.29it/s] 
Training: Epoch 070 | Batch 220 | Loss 0.00062: 100%|██████████| 229/229 [00:02<00:00, 109.81it/s]
Validating: Epoch 070 | Batch 050 | Loss 0.05918: 100%|██████████| 58/58 [00:00<00:00, 91.56it/s] 
Training: Epoch 071 | Batch 220 | Loss 0.00070: 100%|██████████| 229/229 [00:02<00:00, 109.25it/s]
Validating: Epoch 071 | Batch 050 | Loss 0.06033: 100%|██████████| 58/58 [00:00<00:00, 100.21it/s]
Training: Epoch 072 | Batch 220 | Loss 0.00084: 100%|██████████| 229/229 [00:02<00:00, 109.45it/s]
Validating: Epoch 072 | Batch 050 | Loss 0.06379: 100%|██████████| 58/58 [00:00<00:00, 99.16it/s] 
Training: Epoch 073 | Batch 220 | Loss 0.00065: 100%|██████████| 229/229 [00:02<00:00, 108.39it/s]
Validating: Epoch 073 | Batch 050 | Loss 0.05624: 100%|██████████| 58/58 [00:00<00:00, 88.49it/s]
Training: Epoch 074 | Batch 220 | Loss 0.00067: 100%|██████████| 229/229 [00:02<00:00, 109.95it/s]
Validating: Epoch 074 | Batch 050 | Loss 0.05598: 100%|██████████| 58/58 [00:00<00:00, 91.13it/s] 
Training: Epoch 075 | Batch 220 | Loss 0.00051: 100%|██████████| 229/229 [00:02<00:00, 108.97it/s]
Validating: Epoch 075 | Batch 050 | Loss 0.06050: 100%|██████████| 58/58 [00:00<00:00, 95.13it/s] 
Training: Epoch 076 | Batch 220 | Loss 0.00064: 100%|██████████| 229/229 [00:02<00:00, 108.93it/s]
Validating: Epoch 076 | Batch 050 | Loss 0.06296: 100%|██████████| 58/58 [00:00<00:00, 89.37it/s] 
Training: Epoch 077 | Batch 220 | Loss 0.00054: 100%|██████████| 229/229 [00:02<00:00, 108.82it/s]
Validating: Epoch 077 | Batch 050 | Loss 0.05965: 100%|██████████| 58/58 [00:00<00:00, 90.24it/s] 
Training: Epoch 078 | Batch 220 | Loss 0.00039: 100%|██████████| 229/229 [00:02<00:00, 108.07it/s]
Validating: Epoch 078 | Batch 050 | Loss 0.06089: 100%|██████████| 58/58 [00:00<00:00, 99.90it/s] 
Training: Epoch 079 | Batch 220 | Loss 0.00059: 100%|██████████| 229/229 [00:02<00:00, 109.33it/s]
Validating: Epoch 079 | Batch 050 | Loss 0.05865: 100%|██████████| 58/58 [00:00<00:00, 86.68it/s]
Training: Epoch 080 | Batch 220 | Loss 0.00047: 100%|██████████| 229/229 [00:02<00:00, 107.88it/s]
Validating: Epoch 080 | Batch 050 | Loss 0.05940: 100%|██████████| 58/58 [00:00<00:00, 91.40it/s] 
Training: Epoch 081 | Batch 220 | Loss 0.00038: 100%|██████████| 229/229 [00:02<00:00, 109.83it/s]
Validating: Epoch 081 | Batch 050 | Loss 0.05895: 100%|██████████| 58/58 [00:00<00:00, 92.55it/s] 
Training: Epoch 082 | Batch 220 | Loss 0.00053: 100%|██████████| 229/229 [00:02<00:00, 107.74it/s]
Validating: Epoch 082 | Batch 050 | Loss 0.05899: 100%|██████████| 58/58 [00:00<00:00, 94.31it/s] 
Training: Epoch 083 | Batch 220 | Loss 0.00044: 100%|██████████| 229/229 [00:02<00:00, 108.30it/s]
Validating: Epoch 083 | Batch 050 | Loss 0.05881: 100%|██████████| 58/58 [00:00<00:00, 100.19it/s]
Training: Epoch 084 | Batch 220 | Loss 0.00043: 100%|██████████| 229/229 [00:02<00:00, 107.61it/s]
Validating: Epoch 084 | Batch 050 | Loss 0.05924: 100%|██████████| 58/58 [00:00<00:00, 92.54it/s] 
Training: Epoch 085 | Batch 220 | Loss 0.00034: 100%|██████████| 229/229 [00:02<00:00, 108.42it/s]
Validating: Epoch 085 | Batch 050 | Loss 0.05848: 100%|██████████| 58/58 [00:00<00:00, 89.88it/s] 
Training: Epoch 086 | Batch 220 | Loss 0.00037: 100%|██████████| 229/229 [00:02<00:00, 109.04it/s]
Validating: Epoch 086 | Batch 050 | Loss 0.06038: 100%|██████████| 58/58 [00:00<00:00, 91.18it/s] 
Training: Epoch 087 | Batch 220 | Loss 0.00044: 100%|██████████| 229/229 [00:02<00:00, 106.95it/s]
Validating: Epoch 087 | Batch 050 | Loss 0.05913: 100%|██████████| 58/58 [00:00<00:00, 91.68it/s] 
Training: Epoch 088 | Batch 220 | Loss 0.00040: 100%|██████████| 229/229 [00:02<00:00, 107.78it/s]
Validating: Epoch 088 | Batch 050 | Loss 0.06053: 100%|██████████| 58/58 [00:00<00:00, 95.34it/s] 
Training: Epoch 089 | Batch 220 | Loss 0.00036: 100%|██████████| 229/229 [00:02<00:00, 109.27it/s]
Validating: Epoch 089 | Batch 050 | Loss 0.06017: 100%|██████████| 58/58 [00:00<00:00, 97.60it/s] 
Training: Epoch 090 | Batch 220 | Loss 0.00035: 100%|██████████| 229/229 [00:02<00:00, 107.14it/s]
Validating: Epoch 090 | Batch 050 | Loss 0.05965: 100%|██████████| 58/58 [00:00<00:00, 85.56it/s]
Training: Epoch 091 | Batch 220 | Loss 0.00029: 100%|██████████| 229/229 [00:02<00:00, 106.66it/s]
Validating: Epoch 091 | Batch 050 | Loss 0.06056: 100%|██████████| 58/58 [00:00<00:00, 100.32it/s]
Training: Epoch 092 | Batch 220 | Loss 0.00033: 100%|██████████| 229/229 [00:02<00:00, 108.75it/s]
Validating: Epoch 092 | Batch 050 | Loss 0.06281: 100%|██████████| 58/58 [00:00<00:00, 92.45it/s] 
Training: Epoch 093 | Batch 220 | Loss 0.00033: 100%|██████████| 229/229 [00:02<00:00, 108.48it/s]
Validating: Epoch 093 | Batch 050 | Loss 0.06108: 100%|██████████| 58/58 [00:00<00:00, 94.04it/s] 
Training: Epoch 094 | Batch 220 | Loss 0.00023: 100%|██████████| 229/229 [00:02<00:00, 107.43it/s]
Validating: Epoch 094 | Batch 050 | Loss 0.06025: 100%|██████████| 58/58 [00:00<00:00, 94.23it/s] 
Training: Epoch 095 | Batch 220 | Loss 0.00027: 100%|██████████| 229/229 [00:02<00:00, 110.26it/s]
Validating: Epoch 095 | Batch 050 | Loss 0.06144: 100%|██████████| 58/58 [00:00<00:00, 94.22it/s] 
Training: Epoch 096 | Batch 220 | Loss 0.00029: 100%|██████████| 229/229 [00:02<00:00, 107.99it/s]
Validating: Epoch 096 | Batch 050 | Loss 0.06020: 100%|██████████| 58/58 [00:00<00:00, 96.98it/s] 
Training: Epoch 097 | Batch 220 | Loss 0.00029: 100%|██████████| 229/229 [00:02<00:00, 109.88it/s]
Validating: Epoch 097 | Batch 050 | Loss 0.06188: 100%|██████████| 58/58 [00:00<00:00, 94.59it/s] 
Training: Epoch 098 | Batch 220 | Loss 0.00026: 100%|██████████| 229/229 [00:02<00:00, 109.33it/s]
Validating: Epoch 098 | Batch 050 | Loss 0.06273: 100%|██████████| 58/58 [00:00<00:00, 94.84it/s] 
Training: Epoch 099 | Batch 220 | Loss 0.00026: 100%|██████████| 229/229 [00:02<00:00, 108.31it/s]
Validating: Epoch 099 | Batch 050 | Loss 0.06119: 100%|██████████| 58/58 [00:00<00:00, 94.30it/s] 
Training: Epoch 100 | Batch 220 | Loss 0.00025: 100%|██████████| 229/229 [00:02<00:00, 108.56it/s]
Validating: Epoch 100 | Batch 050 | Loss 0.06252: 100%|██████████| 58/58 [00:00<00:00, 94.96it/s] 
Looping finished

In [ ]:
plot.plot_loss(trainer_cnn_scheduling.log)
plot.plot_metrics(trainer_cnn_scheduling.log)
No description has been provided for this image
No description has been provided for this image
In [ ]:
def plot_lr(lr):
    din_a4 = np.array([210, 297]) / 25.4
    fig = plt.figure(figsize=din_a4)

    def subplot_lr():
        ax = plt.gca()

        ax.set_xlabel("Steps/Epochs", fontsize=9)
        ax.set_ylabel("Learning rate", fontsize=9)
        ax.tick_params(axis="both", which="major", labelsize=9)
        ax.tick_params(axis="both", which="minor", labelsize=8)
        ax.grid(alpha=0.4)

        ax.plot(lr)

    fig.add_subplot(3, 1, 1)
    subplot_lr()

    plt.tight_layout()
    plt.show()


plot_lr(trainer_cnn_scheduling.scheduler.log)
No description has been provided for this image

Evaluation¶

In [ ]:
_, model_cnn_scheduling, _, _ = utils_checkpoints.load(path_dir_exp_cnn_scheduling / "checkpoints" / "final.pth")
print(torchsummary.summary(model_cnn_scheduling, config.MODEL["kwargs"]["shape_input"]))

evaluator_cnn_scheduling = Evaluator("svhn_cnn_scheduling", model_cnn_scheduling)
evaluator_cnn_scheduling.evaluate()

print(f"Loss on test data: {evaluator_cnn_scheduling.log["total"]["loss"]}")
print(f"Metrics on test data")
for name, metrics in evaluator_cnn_scheduling.log["total"]["metrics"].items():
    print(f"    {name:<10}: {metrics}")
==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
├─Sequential: 1-1                        [-1, 128, 4, 4]           --
|    └─BlockCNN2d: 2-1                   [-1, 32, 16, 16]          --
|    |    └─Conv2d: 3-1                  [-1, 32, 32, 32]          2,432
|    |    └─InstanceNorm2d: 3-2          [-1, 32, 32, 32]          --
|    |    └─ReLU: 3-3                    [-1, 32, 32, 32]          --
|    |    └─MaxPool2d: 3-4               [-1, 32, 16, 16]          --
|    └─BlockCNN2d: 2-2                   [-1, 64, 8, 8]            --
|    |    └─Conv2d: 3-5                  [-1, 64, 16, 16]          51,264
|    |    └─InstanceNorm2d: 3-6          [-1, 64, 16, 16]          --
|    |    └─ReLU: 3-7                    [-1, 64, 16, 16]          --
|    |    └─MaxPool2d: 3-8               [-1, 64, 8, 8]            --
|    └─BlockCNN2d: 2-3                   [-1, 128, 4, 4]           --
|    |    └─Conv2d: 3-9                  [-1, 128, 8, 8]           204,928
|    |    └─InstanceNorm2d: 3-10         [-1, 128, 8, 8]           --
|    |    └─ReLU: 3-11                   [-1, 128, 8, 8]           --
|    |    └─MaxPool2d: 3-12              [-1, 128, 4, 4]           --
├─MLP: 1-2                               [-1, 10]                  --
|    └─Sequential: 2-4                   [-1, 10]                  --
|    |    └─Flatten: 3-13                [-1, 2048]                --
|    |    └─Linear: 3-14                 [-1, 10]                  20,490
==========================================================================================
Total params: 279,114
Trainable params: 279,114
Non-trainable params: 0
Total mult-adds (M): 29.25
==========================================================================================
Input size (MB): 0.01
Forward/backward pass size (MB): 0.44
Params size (MB): 1.06
Estimated Total Size (MB): 1.51
==========================================================================================
==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
├─Sequential: 1-1                        [-1, 128, 4, 4]           --
|    └─BlockCNN2d: 2-1                   [-1, 32, 16, 16]          --
|    |    └─Conv2d: 3-1                  [-1, 32, 32, 32]          2,432
|    |    └─InstanceNorm2d: 3-2          [-1, 32, 32, 32]          --
|    |    └─ReLU: 3-3                    [-1, 32, 32, 32]          --
|    |    └─MaxPool2d: 3-4               [-1, 32, 16, 16]          --
|    └─BlockCNN2d: 2-2                   [-1, 64, 8, 8]            --
|    |    └─Conv2d: 3-5                  [-1, 64, 16, 16]          51,264
|    |    └─InstanceNorm2d: 3-6          [-1, 64, 16, 16]          --
|    |    └─ReLU: 3-7                    [-1, 64, 16, 16]          --
|    |    └─MaxPool2d: 3-8               [-1, 64, 8, 8]            --
|    └─BlockCNN2d: 2-3                   [-1, 128, 4, 4]           --
|    |    └─Conv2d: 3-9                  [-1, 128, 8, 8]           204,928
|    |    └─InstanceNorm2d: 3-10         [-1, 128, 8, 8]           --
|    |    └─ReLU: 3-11                   [-1, 128, 8, 8]           --
|    |    └─MaxPool2d: 3-12              [-1, 128, 4, 4]           --
├─MLP: 1-2                               [-1, 10]                  --
|    └─Sequential: 2-4                   [-1, 10]                  --
|    |    └─Flatten: 3-13                [-1, 2048]                --
|    |    └─Linear: 3-14                 [-1, 10]                  20,490
==========================================================================================
Total params: 279,114
Trainable params: 279,114
Non-trainable params: 0
Total mult-adds (M): 29.25
==========================================================================================
Input size (MB): 0.01
Forward/backward pass size (MB): 0.44
Params size (MB): 1.06
Estimated Total Size (MB): 1.51
==========================================================================================
Setting up dataloader...
Test dataset
Dataset SVHN
    Number of datapoints: 26032
    Root location: /home/user/karacora/lab-vision-systems-assignments/assignment_2/data/svhn
    Split: test
    Transform: Compose(
    ToTensor()
)
    Transform of target: Compose(
)
Setting up dataloader finished
Setting up criterion...
Setting up criterion finished
Setting up measurers...
Setting up measurers finished
Validating: Batch 100 | Loss 0.35724: 100%|██████████| 102/102 [00:00<00:00, 106.22it/s]
Loss on test data: 0.4602549399747831
Metrics on test data
    Accuracy  : 0.9083819913952059

Discussion¶

The scheduler works as desired. However, I did not manage to mitigate overfitting using this scheduler. The evaluation result is slightly worse than the previously considered models, except for the L1 regularization. I think my model is "too powerful" and I handled the hyperparameter optimization in a way that makes it hard to mitigate overfitting. One solution would be to use optuna with more epochs but this takes much time.